(i) APPLICANT: (Other than US) THE WALTER AND ELIZA HALL INSTITUTE OF
MEDICAL RESEARCH
(US Only)
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



CACGCCGCCC ACGTGAAGGC 20



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
TTCGCCAATG ACAAGACGCT 20 



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1236 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..636

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:



CGAGGCTCAA GCTCCGGGCG GATTCTGCGT GCCGCTCTCG CTCCTTGGGG TCTGTTGGCC -101
GGCCTGTGCC ACCCGGACGC CCGGCTCACT GCCTCTGTCT CCCCCATCAG CGCAGCCCCG -41
GACGCTATGG CCCACCCCTC CAGCTGGCCC CTCGAGTAGG -1



ATG GTA GCA CGC AAC CAG GTG GCA GCC GAC AAT GCG ATC TCC CGG GCA 48
Met Val Ala Arg Asn Gln Val Ala Ala Asp Asn Ala Ile Ser Pro Ala
1 5 10 15

GCA GAG CCC CGA CGG CGG TCA GAG CCC TCC TCG TCC TCG TCT TCG TCC 96
Ala Glu Pro Arg Arg Arg Ser Glu Pro Ser Ser Ser Ser Ser Ser Ser
20 25 30

TCG CCA GCG GCC CCC GTG CGT CCC CGG CCC TGC CCG GCG GTC CCA GCC 144
Ser Pro Ala Ala Pro Val Arg Pro Arg Pro Cys Pro Ala Val Pro Ala
35 40 45

CCA GCC CCT GGC GAC ACT CAC TTC CGC ACC TTC CGC TCC CAC TCC GAT 192
Pro Ala Pro Gly Asp Thr His Phe Arg Thr Phe Arg Ser His Ser Asp
50 55 60

TAC CGG CGC ATC ACG CGG ACC AGC GCG CTC CTG GAC GCC TGC GGC TTC 240
Tyr Arg Arg Ile Thr Arg Thr Ser Ala Leu Leu Asp Ala Cys Gly Phe
65 70 75 80

TAT TGG GGA CCC CTG AGC GTG CAC GGG GCG CAC GAG CGG CTG CGT GCC 288
Tyr Trp Gly Pro Leu Ser Val His Gly Ala His Glu Arg Leu Arg Ala
85 90 95

GAG CCC GTG GGC ACC TTC TTG GTG CGC GAC AGT CGT CAA CGG AAC TGC 336
Glu Pro Val Gly Thr Phe Leu Val Arg Asp Ser Arg Gln Arg Asn Cys
100 105 110

TTC TTC GCG CTC AGC GTG AAG ATG GCT TCG GGC CCC ACG AGC ATC CGC 384
Phe Phe Ala Leu Ser Val Lys Met Ala Ser Gly Pro Thr Ser Ile Arg
115 120 125

GTG CAC TTC CAG GCC GGC CGC TTC CAC TTG GAC GGC AGC CGC GAG ACC 432
Val His Phe Gln Ala Gly Arg Phe His Leu Asp Gly Ser Arg Glu Thr
130 135 140

TTC GAC TGC CTT TTC GAG CTG CTG GAG CAC TAC GTG GCG GCG CCG CGC 480
Phe Asp Cys Leu Phe Glu Leu Leu Glu His Tyr Val Ala Ala Pro Arg
145 150 155 160

CGC ATG TTG GGG GCC CCG CTG CGC CAG CGC CGC GTG CGG CCG CTG CAG 528
Arg Met Leu Gly Ala Pro Leu Arg Gln Arg Arg Val Arg Pro Leu Gln
165 170 175

GAG CTG TGT CGC CAG CGC ATC GTG GCC GCC GTG GGT CGC GAG AAC CTG 576
Glu Leu Cys Arg Gln Arg Ile Val Ala Ala Val Gly Arg Glu Asn Leu
180 185 190

GCG CGC ATC CCT CTT AAC CCG GTA CTC CGT GAC TAC CTG ACT TCC TTC 624
Ala Arg Ile Pro Leu Asn Pro Val Leu Arg Asp Tyr Leu Ser Ser Phe
195 200 205

CCC TTC CAG ATC TGA CCGGCTG CCGCTGTGCC GCAGCATTAA GTGGGGGCGC 676
Pro Phe Gln Ile *
210



CTTATTATTT CTTATTATTA ATTATTATTA TTTTTCTGGA ACCACGTGOG AGCCCTCCCC 736

GCCTGGGTCG GAGGGAGTGG TTGTGGAGGG TGAGATCCCT CCCACTTCTG GCTGGAGACC 796

TCATCCCACC TCTCAGGGGT GGGGGTGCTC C CCTC C\TGQ t GCTCCCTCCG GGTCCCCCCT 856

GGTTGTAGCA GCTTGTGTCT GGGGCCAGGA CCTGAATTCC ACTCCTACCT CTCCATGTTT 916

ACATATTCCC AGTATCTTTG CACAAACCAG GGGTCGGGGA GGGTCTCTGG 976

CTGCTGTGCA GAATATCCTA TTTTATATTT TTACAGCCAG TTTAGGTAAT AAACTTTATT 1036

ATGAAAGTTT TTTTTTAAAA GAAAAAAAAA AAAAAAAAA 1075



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 212 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Val Ala Arg Asn Gln Val Ala Ala Asp Asn Ala Ile Ser Pro Ala
1 5 10 15

Ala Glu Pro Arg Arg Arg Ser Glu Pro Ser Ser Ser Ser Ser Ser Ser
20 25 30

Ser Pro Ala Ala Pro Val Arg Pro Arg Pro Cys Pro Ala Val Pro Ala
35 40 45

Pro Ala Pro Gly Asp Thr His Phe Arg Thr Phe Arg Ser His Ser Asp
50 55 60

Tyr Arg Arg Ile Thr Arg Thr Ser Ala Leu Leu Asp Ala Cys Gly Phe
65 70 75 80

Tyr Trp Gly Pro Leu Ser Val His Gly Ala His Glu Arg Leu Arg Ala
85 90 95

Glu Pro Val Gly Thr Phe Leu Val Arg Asp Ser Arg Gln Arg Asn Cys
100 105 110

Phe Phe Ala Leu Ser Val Lys Met Ala Ser Gly Pro Thr Ser Ile Arg
115 120 125

Val His Phe Gln Ala Gly Arg Phe His Leu Asp Gly Ser Arg Glu Thr
130 135 140

Phe Asp Cys Leu Phe Glu Leu Leu Glu His Tyr Val Ala Ala Pro Arg
145 150 155 160

Arg Met Leu Gly Ala Pro Leu Arg Gln Arg Arg Val Arg Pro Leu Gln
165 170 175

Glu Leu Cys Arg Gln Arg Ile Val Ala Ala Val Gly Arg Glu Asn Leu
180 185 190

Ala Arg Ile Pro Leu Asn Pro Val Leu Arg Asp Tyr Leu Ser Ser Phe
195 200 205

Pro Phe Gln Ile
210



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1121 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 223..819

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GCGATCTGTG GGTGACAGTO TCTGCGAGAG ACTTTGGCAC ACCATTCTGC CGGAATTTGG 60 

AGAAAAAGAA CCAGCCGCTT CCAGTCCCCT CCCCCTCCGC CACCATTTCG 6ACACCCTGC 120 

ACACTCTCGT TTTGGGGTAC CCTGTGACTT CCAGGCAGCA CGCGAOGTCC ACTGGCCCCA 180

GCTCGGGCGA CCAGCTGTCT GGGACGTGTT GACTCATCTC CC ATG ACC CTG CGG 234 

Met Thr Leu Arg
1

TGC CTG GAG CCC TCC GGG AAT GGA GCG GAC AGG ACG CGG AGC CAG TGG 282
Cys Leu Glu Pro Ser Gly Asn Gly Ala Asp Arg Thr Arg Ser Gln Trp
5 10 15 20

GGG ACC GCG GGG TTG CCG GAG GAA CAG TCC CCC GAG GCG GCG CGT CTG 330
Gly Thr Ala Gly Leu Pro Glu Glu Gln Ser Pro Glu Ala Ala Arg Leu
25 30 35

GCG AAA GCC CTG CGC GAG CTC AGT CAA ACA GGA TGG TAC TGG GGA ACT 378
Ala Lys Ala Leu Arg Glu Leu Ser Gln Thr Gly Trp Tyr Trp Gly Ser
40 45 50

ATG ACT GTT AAT GAA GCC AAA GAG AAA TTA AAA GAG GCT CCA GAA GGA 426
Met Thr Val Asn Glu Ala Lys Glu Lys Leu Lys Glu Ala Pro Glu Gly
55 60 65

ACT TTC TTG ATT AGA GAT AGT TCG CAT TCA GAC TAC CTA CTA ACT ATA 474
Thr Phe Leu Ile Arg Asp Ser Ser His Ser Asp Tyr Leu Leu Thr Ile
70 75 80

TCC GTT AAG ACG TCA GCT GGA CCG ACT AAC CTG CGG ATT GAG TAC CAA 522
Ser Val Lys Thr Ser Ala Gly Pro Thr Asn Leu Arg Ile Glu Tyr Gln
85 90 95 100

GAT GGG AAA TTC AGA TTG GAT TCT ATC ATA TGT GTC AAG TCC AAG CTT 570
Asp Gly Lys Phe Arg Leu Asp Ser Ile Ile Cys Val Lys Ser Lys Leu
105 110 115

AAA CAG TTT GAC AGT GTG GTT CAT CTG ATT GAC TAC TAT GTC CAG ATG 618
Lys Gln Phe Asp Ser Val Val His Leu Ile Asp Tyr Tyr Val Gln Met
120 125 130

TGC AAG GAT AAA CGG ACA GGG CCA GAA GCC CCA CGG AAT GGG ACT GTT 666
Cys Lys Asp Lys Arg Thr Gly Pro Glu Ala Pro Arg Asn Gly Thr Val
135 140 145

CAC CTG TAC CTG ACC AAA CCT CTG TAT ACA TCA GCA CCC ACT CTG CAG 714
His Leu Tyr Leu Thr Lys Pro Leu Tyr Thr Ser Ala Pro Thr Leu Gln
150 155 160

CAT TTC TGT CGA CTC GCC ATT AAC AAA TGT ACC GGT ACG ATC TGG GGA 762
His Phe Cys Arg Leu Ala Ile Asn Lys Cys Thr Gly Thr Ile Trp Gly
165 170 175 180

CTG CCT TTA CCA ACA AGA CTA AAA GAT TAC TTG GAA GAA TAT AAA TTC 810
Leu Pro Leu Pro Thr Arg Leu Lys Asp Tyr Leu Glu Glu Tyr Lys Phe
185 190 195

CAG GTA TAAGTATTTC TCTCTCTTTT TCGTTTTTTT TTAAAAAAAA AAAA&CACAT 866
Gln Val

GCCTCATATA GACTATCTCC GAATGCAGCT ATGTGAAAGA GAACCCAGAG GCCCTCCTCT 926 

GGATAACTGC GCAGAATTCT CTCTTAAGGA CAGTTGGGCT CAGTCTAACT TAAAGGTGTG 986 
MGATGTAGC TAGGTATTTT AAAGTTCCCC TTAGGTAGTT TTAGCTGAAT GATGCTTTCT 1046
TTCCTATGGC TGCTCAAGAT CAAATGGCCC TTTTAAATOA AACAAAACAA AACAAAACAA 1106
AAAAAAAAAA AAAAA 1121 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 198 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Thr Leu Arg Cys Leu Glu Pro Ser Gly Asn Gly Ala Asp Arg Thr
1 5 10 15

Arg Ser Gln Trp Gly Thr Ala Gly Leu Pro Glu Glu Gln Ser Pro Glu
20 25 30

Ala Ala Arg Leu Ala Lys Ala Leu Arg Glu Leu Ser Gln Thr Gly Trp
35 40 45

Tyr Trp Gly Ser Met Thr Val Asn Glu Ala Lys Glu Lys Leu Lys Glu
50 55 60

Ala Pro Glu Gly Thr Phe Leu Ile Arg Asp Ser Ser His Ser Asp Tyr
65 70 75 80

Leu Leu Thr Ile Ser Val Lys Thr Ser Ala Gly Pro Thr Asn Leu Arg
85 90 95

Ile Glu Tyr Gln Asp Gly Lys Phe Arg Leu Asp Ser Ile Ile Cys Val
100 105 110

Lys Ser Lys Leu Lys Gln Phe Asp Ser Val Val His Leu Ile Asp Tyr
115 120 125

Tyr Val Gln Met Cys Lys Asp Lys Arg Thr Gly Pro Glu Ala Pro Arg
130 135 140

Asn Gly Thr Val His Leu Tyr Leu Thr Lys Pro Leu Tyr Thr Ser Ala
145 150 155 160

Pro Thr Leu Gln His Phe Cys Arg Leu Ala Ile Asn Lys Cys Thr Gly
165 170 175

Thr Ile Trp Gly Leu Pro Leu Pro Thr Arg Leu Lys Asp Tyr Leu Glu
180 185 190

Glu Tyr Lys Phe Gln Val
195 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2187 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 18..695

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

CGCTGGCTCC GTGCGCC ATG GTC ACC CAC AGC AAG TTT CCC GCC GCC GGG 50 
Met Val Thr His Ser Lys Phe Pro Ala Ala Gly 
1 5 10 

ATG AGC CGC CCC CTG GAC ACC AGC CTG CGC CTC AAG ACC TTC AGC TCC 98 
Met Ser Arg Pro Leu Asp Thr Ser Leu Arg Leu Lys Thr Phe Ser Ser
15 20 25

AAA AGC GAG TAC CAG CTG GTG GTG AAC GCC GTG CGC AAG CTG CAG GAG 146
Lys Ser Glu Tyr Gln Leu Val Val Asn Ala Val Arg Lys Leu Gln Glu
30 35 40

AGC GGA TTC TAC TGG AGC GCC GTG ACC GGC GGC GAG GCG AAC CTG CTG 194
Ser Gly Phe Tyr Trp Ser Ala Val Thr Gly Gly Glu Ala Asn Leu Leu
45 50 55

CTC AGC GCC GAG CCC GCG GGC ACC TTT CTT ATC CGC GAC AGC TCG GAC 242
Leu Ser Ala Glu Pro Ala Gly Thr Phe Leu Ile Arg Asp Ser Ser Asp
60 65 70 75

CAG CGC CAC TTC TTC ACG TTG AGC GTC AAG ACC CAG TCG GGG ACC AAG 290
Gln Arg His Phe Phe Thr Leu Ser Val Lys Thr Gln Ser Gly Thr Lys
80 85 90

AAC CTA CGC ATC CAG TGT GAG GGG GGC AGC TTT TCG CTG CAG AGT GAC 338
Asn Leu Arg Ile Gln Cys Glu Gly Gly Ser Phe Ser Leu Gln Ser Asp
95 100 105

CCC CGA AGC ACG CAG CCA GTT CCC CGC TTC GAC TGT GTA CTC AAG CTG 386
Pro Arg Ser Thr Gln Pro Val Pro Arg Phe Asp Cys Val Leu Lys Leu
110 115 120

GTG CAC CAC TAC ATG CCG CCT CCA GGG ACC CCC TCC TTT TCT TTG CCA 434
Val His His Tyr Met Pro Pro Pro Gly Thr Pro Ser Phe Ser Leu Pro
125 130 135

CCC ACG GAA CCC TCG TCC GAA GTT CCG GAG CAG CCA CCT GCC CAG GCA 482
Pro Thr Glu Pro Ser Ser Glu Val Pro Glu Gln Pro Pro Ala Gln Ala
140 145 150 155

CTC CCC GGG AGT ACC CCC AAG AGA GCT TAC TAC ATC TAT TCT GGG GGC 530
Leu Pro Gly Ser Thr Pro Lys Arg Ala Tyr Tyr Ile Tyr Ser Gly Gly
160 165 170

GAG AAG ATT CCG CTG GTA CTG AGC CGA CCT CTC TCC TCC AAC GTG GCC 578
Glu Lys Ile Pro Leu Val Leu Ser Arg Pro Leu Ser Ser Asn Val Ala
175 180 185

ACC CTC CAG CAT CTT TGT CGG AAG ACT GTC AAC GGC CAC CTG GAC TCC 626
Thr Leu Gln His Leu Cys Arg Lys Thr Val Asn Gly His Leu Asp Ser
190 195 200

TAT GAG AAA GTG ACC CAG CTG CCT GGA CCC ATT CGG GAG TTC CTG GAT 674
Tyr Glu Lys Val Thr Gln Leu Pro Gly Pro Ile Arg Glu Phe Leu Asp
205 210 215

CAG TAT GAT GCT CCA CTT TAAGGAGCAA AAGGGTCAGA GGGGGGCCTG 722
Gln Tyr Asp Ala Pro Leu
220 225
GGTCGGTOGG TCGCCTCTCC TCCGAGGCAC ATGGCACAAG CACAAAAATC CAGCCCCAAC 782

GGTCGGTAGC TCCCAOTOAG CCAGGGGCAG A7TCO0TT0T TCCTCAGGCC CTCCACTCCC 842

GCAGAGTAGA GCTGOCAGGA CCTGGAATTC GTCTGAGGGG AGGGGGAGCT GCCACCTGCT 902

TTCCCCCCTC CCCCAGCTCC AGCTTCTTTC AAGTGGAGCC AGCCGGCCTG GCCTGGTGGG 962

ACAATACCT? TGACAAGCGG ACTCTCCCCT CCCCTTCCTC CACACCCCCT CTGCTTCCCA 1022

AGGGAGGTGG QGACACCTCC AAGTGTTGAA CTTAGAACTG CAAGGGGAAT CTTCAAACTT 1082

TCCCGCTGGA ACTTGTTTGC GCTTTGATT? GGTTTGATCA AGAGCAGGCA CCTQGGGGAA 1142

GGATGGAAGA GAAAAGGGTG TGTGAAGGGT TTTTATGCTG GCCAAAGAAA TAACCACTCC 1202

CACTGCCCAA CCTAGGTGAG GAGTGGTGGC TCCTGGCTC7 GGGGAGAGTG GCAAGGGGTG 1262

ACCTGAAGAO AGOTAfACTG GTGCCAGGCT CCTCTCCATG GGGCAGCTAA TGAAACCTCG 1322

CAGATCCCTT GCACCCCAGA ACCCTCCCCG TTGTGAAGAG GCAGTAGCAT TTAGAAGGGA 1382

GACAwATGAG >jl.TGGTGA^C TGGCCGCCTT TTCCAACACC GAAGGGAGGC AGATCAACAG 1442

ATGAGCCATC TTGGAGCCCA GGTTTCCCCT GGAGCAGATG GAGGGTTCTG CTTTGTCTCT 1502

CCTATGTGGG GCTAGGAGAC TCGCCTTAAA TGCCCTCTGT CCCAGGGATG SGGATTGGCA 1562

CACAAGGAGC CAAACACAGC CAATAGGCAG AGAGTTGAGG GATTCACCCA GGTGGCTACA 1622

GGCCAGGGGA AGTGGCTGCA GGGGAGAGAC CCAGTCACTC CAGGAGACTC CTGAGTTAAC 1682

ACTGGGAAGA CATTGGCCAG TCCTAGTCAT CTCTCGGTCA GTAGGTCCGA GAGCTTCCAG 1742

GCCCTGCACA GCCCTCCTTT CTCACCTGGG GGGAGGCAGG AGGTGATGGA GAAGCCTTCC 1802

CATGCCGCTC ACAG&GGCCT CACGGGAATG CAGCAGCCAT GCAATTACCT GGAACTGGTC 1862

CTGTGTTGGG GAGAAACAAG TTTTCTGAAG TCAGGTATGG GGCTGGGTGG GGCAGCTGTG 1922

TGTTGGGGTG GCTTTTTTCT CTCTGTTTTG AATAATGTTT ACAATTTGCC TCAATCACTT 1982

T7ATAAAAAT CCACCTCCAG CCCGCCCCTC TCCCCACTCA GGCCTTCGAG GCTGTCTGAA 2042

GATGCTTGAA AAACTCAACC AAATCCCAGT TCAACTCAGA CTTTGCACAT ATATTTATAT 2102

TTATACTCAG AAAAGAAACA TTTCAGTAAT TTATAATAAA AGAGCACTAT TTTTTAATGA 2162

AAAAAAAAAA AAAAAAAAAA AAAAA 2187



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 225 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Val Thr His Ser Lys Phe Pro Ala Ala Gly Met Ser Arg Pro Leu
1 5 10 15

Asp Thr Ser Leu Arg Leu Lys Thr Phe Ser Ser Lys Ser Glu Tyr Gln
20 25 30

Leu Val Val Asn Ala Val Arg Lys Leu Gln Glu Ser Gly Phe Tyr Trp
35 40 45

Ser Ala Val Thr Gly Gly Glu Ala Asn Leu Leu Leu Ser Ala Glu Pro
50 55 60

Ala Gly Thr Phe Leu Ile Arg Asp Ser Ser Asp Gln Arg His Phe Phe
65 70 75 80

Thr Leu Ser Val Lys Thr Gln Ser Gly Thr Lys Asn Leu Arg Ile Gln
85 90 95

Cys Glu Gly Gly Ser Phe Ser Leu Gln Ser Asp Pro Arg Ser Thr Gln
100 105 110

Pro Val Pro Arg Phe Asp Cys Val Leu Lys Leu Val His His Tyr Met
115 120 125

Pro Pro Pro Gly Thr Pro Ser Phe Ser Leu Pro Pro Thr Glu Pro Ser
130 135 140

Ser Glu Val Pro Glu Gln Pro Pro Ala Gln Ala Leu Pro Gly Ser Thr
145 150 155 160

Pro Lys Arg Ala Tyr Tyr Ile Tyr Ser Gly Gly Glu Lys Ile Pro Leu
165 170 175

Val Leu Ser Arg Pro Leu Ser Ser Asn Val Ala Thr Leu Gln His Leu
180 185 190

Cys Arg Lys Thr Val Asn Gly His Leu Asp Ser Tyr Glu Lys Val Thr
195 200 205

Gln Leu Pro Gly Pro Ile Arg Glu Phe Leu Asp Gln Tyr Asp Ala Pro
210 215 220

Leu 



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1094 base pairs
(B) TYPE: nucleic acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



CTCCGGCTGG CCCCTTCT3T AGGATGGTAG CACACAACCA GGTGGCAGCC GACAATGCAG 60

TCTCCACAGC AGCAGAGCCC CGACGGCGGC CAGAACCTTC CTCCTCTTCC TCCTCCTCGC 120

CCGCGGCCCC CGCGCGCCCG CGGCCGTGCC CCGCGGTCCC GGCCCCGGCC CCCGGCGACA 180

CGCACTTCCG CACATTCCGT TCGCACQCCG ATTACCGGCG CATCACGCGC GCCAOCGCGC 240

TCCTGGACGC CTGCGGATTC TACTGGGGGC CCCTGAGCGT GCACGGGGCG CACGAGCGGC 300

TGCGCGCCGA GCCCGTGGGC ACCTTCCTGG TGCGCGACAG CCGCCAGCGG AACTGCTTTT 360

TCGCCCTTAG CGTGAAGATG GCCTCGGGAC CCACGAGCAT CCGCGTGCAC TTTCAGGCCG 420

GCCGCTTTCA CCTGGATGGC AGCCGCGAGA GCTTCGACTG CCTCTTCGAG CTGCTGGAGC 480
ACTACGTGGC GGCGCCGCGC CGCATGCTGG GGSCCCOSCT GCGCCAGCGC CGCGTGCGGC 540

CGCTGCAGGA GCTGTGCCGC CAGCGCATCG TGOCCACCGT GGGCCGCGAG AACCTGGCTC 600

GCATCCCCOT CAACCCCGTC CTCCGOSACT ACCTGAGCTC CTTCCCCTTC CAGATTTGAC 660 

CGGCAGCGCC CGCCGT6CAC GCAGCATTAA CTGGGATGCC GTGTTATTTT GTTATTACT? 720 

GCCTGGAACC ATGTGGGTAC CCTCCCCGGC CTGGGTTGGA GGGAGCGGAT GGGTGTAGGG 780 

GCGAGGCGCC TCCCGCCCTC GGCTGGAGAC GAGGCCGCAG ACCCCTTCTC ACCTCTTGAG 840 

GGGGTCCTCC CCCTCC1GOT GCTCCCTCTG GGTCCCCCTG GTTGTTGTAG CAGCTTAACT 900

GTATCTGGAG CCAGGACCTG AACTCQCACC TCCTACCTCT TCATGTTTAC ATATACCCAG 960

TATCTTTGCA CAAACCAGGG GTTGGGGGAG GGTCTCTGGC TTTATTTTTC TGCTGTGCAG 1020

AATCCTATTT TATATTTTTT AAAGTCAGTT TAGGTAATAA ACTTTATTAT GAAAGTTTTT 1080

TTTTOTAAAA AAAA 1094 



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 211 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Met Val Ala His Asn Gln Val Ala Ala Asp Asn Ala Val Ser Thr Ala
1 5 10 15

Ala Glu Pro Arg Arg Arg Pro Glu Pro Ser Ser Ser Ser Ser Ser Ser
20 25 30

Pro Ala Ala Pro Ala Arg Pro Arg Pro Cys Pro Ala Val Pro Ala Pro
35 40 45

Ala Pro Gly Asp Thr His Phe Arg Thr Phe Arg Ser His Ala Asp Tyr
50 55 60

Arg Arg Ile Thr Arg Ala Ser Ala Leu Leu Asp Ala Cys Gly Phe Tyr
65 70 75 80

Trp Gly Pro Leu Ser Val His Gly Ala His Glu Arg Leu Arg Ala Glu
85 90 95

Pro Val Gly Thr Phe Leu Val Arg Asp Ser Arg Gln Arg Asn Cys Phe
100 105 110

Phe Ala Leu Ser Val Lys Met Ala Ser Gly Pro Thr Ser Ile Arg Val
115 120 125

His Phe Gln Ala Gly Arg Phe His Leu Asp Gly Ser Arg Glu Ser Phe
130 135 140

Asp Cys Leu Phe Glu Leu Leu Glu His Tyr Val Ala Ala Pro Arg Arg
145 150 155 160

Met Leu Gly Ala Pro Leu Arg Gln Arg Arg Val Arg Pro Leu Gln Glu
165 170 175

Leu Cys Arg Gln Arg Ile Val Ala Thr Val Gly Arg Glu Asn Leu Ala
180 185 190

Arg Ile Pro Leu Asn Pro Val Leu Arg Asp Tyr Leu Ser Ser Phe Pro
195 200 205

Phe Gln Ile
210 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2807 base pairs
(B) TYPE: nucleic acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:



GGAAACCGAG GCGGGGAGAC CAGGAGGCCT TGGCCTCAGA GCTTCAGAGT CGCGTGGCAG 60

CAAACAGAGA AACCTGTAGA GGGCAGTGTG CGTCAGTTAG CTCAGGGA&G CTGCACGCGA 120

AACTCACCCG CCTTCATTCA TAAACATCGT CAGCTAGGCA CCTACTCCTG GGCTTTCAGG 180

ACAAACTGAA TCACGAAACC ACAGTGTCCT TAAAATAGGT CTGACCGCCT GAATCCCTGG 240

CCAAGGTGTG TACGGGGCAT GGGAGCCCTT OTGCAGAGAT GCTTGCAGGA GCCTTGAGGG 300

GCTCTGTAAG ACAGAGGCTA GGAAGACAAA ■^^^GGCyCJC? C 'X ACAGCTTCTT GTCCTGCCCG 360

GGGCCTCAGT TTCTTC6GTT GCCCACGTAG GAGTGCAGAG AGTCCAGCCC CTGGGGACCC 420

AACCCAACCC CGCCCAGTTT CCGAGGAACT CGTCCGGGAG CGGGGGCGCC CCTCCCGCAC 480

CGCCTTAGGC TTCCTTTGAA GCCTCTGCGG TCAGGCCACC GCTTCCTGGG AAGCCCAAGC 540
CAAGGCCAGG CCGAGTGGCC AACGGGAGGG GCCCGCGCGC S&TTCPGGAS GAGGGCGGCG 600

GCCCCACAGG TCTCCAGGGC TGGCTAGCCG GGCTCCTAGA GCGGAGACTG CCAAGGCCTT 660

CGGGTCCTGG GCAGGAAGGA TCCTGGCftGG GAGGAGTTSC TTGGGGGGTG GGGGGGAAAG 720

GCTCCAGGCC CGGTGGAGCT CTGACCAGGA SAATGCACAC ACTCGGAGGG GAGGAGGCGT 780

GTCAGCCCCA AGCTAGCATC CCACCCGGGG AGCAGCGATG TGGGGCGAAG GTAGCCASAG 840 

CAAAAGAGCA GGCACCAGGT GACACGAAAC AGAAGATTCC GGGTAGAGCC AGAACCCCAG 900 

AAGTCCCATT CAGGSAAGGT GCGAGGCGAG AACGAGTTAG GTGGACCCTC TCCAGGGGCA 960

GCCAAAGAAA TCTAAAGAGA ACCCGAAGGA GTTGCCGGAA AGAGAAACCG AAAGCGGCGG 1020 

TGGGCGGGAT CGGTGGGCGG GGCCTCCCTG GTTTAAGAGC TTGATGCAGG GGCGGGCAGC 1080 

AGCAGAGAGA ACTGCGGCCG TGGCAGCGGC ACGGCTCCCG GCCCCGGAGC ATGCGCGACA 1140

GCAGCCCCGG AACCCCCAGC CGCGGCGCCC CGGGTCCCGC CGCCAGGTGA GCCGAGGCAG 1200 

CTGCGAAGGA GCAGGCGGGA GGGGATGGGA GGAAGGGGAG CAGAGCCTGG CAGGACTATC 1260 

CTCGCAGACT GCATGGCGGG GTCGTGGATG CTATaCCTCJ? GGCGCCCGCC CCACCGGCTG 1320

GCCCAGGCGG CCCCTCGCGC GCGCGGSGCG GCGTCAGCCC CTCCTCTCCG GCCCTGAGCC 1380 

CGGATCGTCC GCCCGGGTTC CAGTTCCCGG CGTGGCCAGT AGGCGGCAAC CGCGAGGCGG 1440

CAAGCCACCC AGCGGGGACG GCCTGGAGTC GGGCCCCTCT CCACGCCCCC TTCTCCACGC 1500

GCGCGGGGAG GCAGGGCTCC ACCGCCAGTC TGGAAGGGTT CCACATACAG GAACGGCCTA 1560

CTTCGCAGAT GAGCCCACCG AGGCTCAGGC TCCGGGCGGA TTCTGCGTGT CACCCTCGCT 1620

CCTTGGGGTC CGCTGGCCGG CCTGTGCCAC CCGGACGCCC GGTTCACTGC CTCTGTCTCC 1680 

CCCATCAGCG CAGCCCCGGA CGCTATGGCC CACCCCTCCA GCTGGCCCCT CGAGTAGGAT 1740 

GGTAGCACGT AACCAGGTGG AAGCCGACAA TGCGATCTCC CCGGCATCAG AGCCCCGACG 1800 

GCGGCCAGAG CCATCCTCGT CCTCGTCTTC GTCCTCGCCG GCGGCCCCGG CGCGTCCCCG 1860 

GCCCTGCCCG GTGGTCCCGG CCCCGGCTCC GGGCGACACT CACTTCCGCA CC1TOX5CTC 1920 

CCACTCTGAT TACCGGCGCA TCACGCGGAC CAGCGCTCTC CTGGACGCCT GCGGCTTCTA 1980 

CTGGGGACCC CTGAGCGTGC ATGGGGCGCA CGAACGGCTG CGTTCCGAAC CCGTGGGCAC 2040 

CTTCTTGGTG CGCGACAGTC GCCAGCGGAA CTGCTTCTTC GCGCTGAGCG TGAAGATGGC 2100

TTCGGGCCCC ACGAGCATTC GTGTGCACTT CCAGGCCGGC CGCTTCCACC TGGACGGCAA 2160 

CCGCGAGACC TTCGACTGCC TCTTCGAGCT GCTGGAGCAC TACGTGGCGG CGCCGCGCCG 2220

CATGTTGGGG fiCCCCACTQC GCCAGCGCCG CGTGCGGCCG CTGCAGGAGC TGTGTCGCCA 2280 

GCGCATCGTG GCCGCCGTGG GTCGCGAGAA CCTGGCACGC ATCCCTCTTA ACCCGGTACT 2340 

CCGTGACTAC CTGAGTTCCT TCCCCTTCCA GATCTGACCG GCTGCCGCCG TGCCOSCAGA 2400 

ATTAAGTGGG AGOGCCTTAT TATTWTPAT TATTAATTAT TATOATTTTT CTGGAACCAC 2460 

GTGGGAGCCC TCCCCGCCTA GGTCGGA5GG AGTGGGTGTG GAGGGTGAGA TCCCTCCCAC 2520

TTCTGGCTGG AGACCTTATC CCGCCTCTCG GGGGGCCTCC CCTCCTGGTG CTCCCTCCCG 2580 

GTCCCCCTGG TTGTAGCAGC TTGTGTCTGG GGCCAGGACC TGAACTCCAC GCCTACCTCT 2640 

CCATGTTTAC ATGTTCCCAG TATCTTTGCA CAAACCAGGG GfOGQGG&GG GTCTCTGGCT 2700 

TCATTTTTC? GCTGTGCAGA ATATTCTATT TTATATTTTT ACATCCAGTT TAGATAATAA 2760 

ACTTTATTAT GAAAGTTTTT TTTTTTAAAG AAACAAAGAT TTCTAGA 2807



(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 212 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Val Ala Arg Asn Gln Val Gln Ala Asp Asn Ala Ile Ser Pro Ala
1 5 10 15

Ser Glu Pro Arg Arg Arg Pro Glu Pro Ser Ser Ser Ser Ser Ser Ser
20 25 30

Ser Pro Ala Ala Pro Ala Arg Pro Arg Pro Cys Pro Val Val Pro Ala
35 40 45

Pro Ala Pro Gly Asp Thr His Phe Arg Thr Phe Arg Ser His Ser Asp
50 55 60

Tyr Arg Arg Ile Thr Arg Thr Ser Ala Leu Leu Asp Ala Cys Gly Phe
65 70 75 80

Tyr Trp Gly Pro Leu Ser Val His Gly Ala His Glu Arg Leu Arg Ser
85 90 95

Glu Pro Val Gly Thr Phe Leu Val Arg Asp Ser Arg Gln Arg Asn Cys
100 105 110

Phe Phe Ala Leu Ser Val Lys Met Ala Ser Gly Pro Thr Ser Ile Arg
115 120 125

Val His Phe Gln Ala Gly Arg Phe His Leu Asp Gly Asn Arg Glu Thr
130 135 140

Phe Asp Cys Leu Phe Glu Leu Leu Glu His Tyr Val Ala Ala Pro Arg
145 150 155 160

Arg Met Leu Gly Ala Pro Leu Arg Gln Arg Arg Val Arg Pro Leu Gln
165 170 175

Glu Leu Cys Arg Gln Arg Ile Val Ala Ala Val Gly Arg Glu Asn Leu
180 185 190

Ala Arg Ile Pro Leu Asn Pro Val Leu Arg Asp Tyr Leu Ser Ser Phe
195 200 205

Pro Phe Gln Ile
210



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1611 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 263..1529

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

CGAATTCCGG GCGGGCTGTG TGAGTCTGTG AGTGGAAGGC GCGCCGGCTC TTTTGTCTGA 60 

GTGTGACCCG GTGGCTTTGT TCCAGGCATT CCGGTGATTT CCTCCGGGCA GTCCGCAQAA 120 

GCCGCAGCGG CCGCCCGCGC TCTCTCTGCA GTCTCCACAC CCGGGAGAGC CTGAGCCCGC 180 

GTCACGCCCC TCAGCCCCCG CTGAGTCCCT TCTCTGTTGT CGCGTCCGAA TCGAGTTCCC 240 

GGAATCAGAC GGTGCCCCAT AG ATG GCC AGC TO CCC CCG AGG GTT AAC GAG 292

Met Ala Ser Phe Pro Pro Arg Val Asn Glu
1 5 10

AAA GAG ATC GTG AGA TCA CGT ACT ATA GGG GAA CTC TTG GCT CCA GCA 340
Lys Glu Ile Val Arg Ser Arg Thr Ile Gly Glu Leu Leu Ala Pro Ala
15 20 25

GCT CCT TTT GAC AAG AAA TGT GGT GGT GAG AAC TGG ACQ GTT GCT TTT 388
Ala Pro Phe Asp Lys Lys Cys Gly Gly Glu Asn Trp Thr Val Ala Phe
30 35 40

GCT CCT GAT GGT TCC TAC TTT GCG TGG TC» CAA GGA TAT CGC ATA GTG 436
Ala Pro Asp Gly Ser Tyr Phe Ala Trp Ser Gln Gly Tyr Arg Ile Val
45 50 55

AAG CTT GTC CCG TGG TCC CAG TCC CGT AAG AAC TTT CTT TTG CAT GGT 484
Lys Leu Val Pro Trp Ser Gln Cys Arg Lys Asn Phe Leu Leu His Gly
60 65 70

TCC AAA AAT CTT ACC AAT TCA AGC TGT CTA AAA TTG GCA AGA CAA AAC 532
Ser Lys Asn Val Thr Asn Ser Ser Cys Leu Lys Leu Ala Arg Gln Asn
75 80 85 90

ACT AAT GGT GGT CAG AAA AAC AAG CCT CCT GAG CAC GTT ATA GAC TGT 580
Ser Asn Gly Gly Gln Lys Asn Lys Pro Pro Glu His Val Ile Asp Cys
95 100 105

GGA GAC ATA GTC TGG AGT CTT GCT TTT GGG TCT TCA GTT CCA GAA AAA 628
Gly Asp Ile Val Trp Ser Leu Ala Phe Gly Ser Ser Val Pro Glu Lys
110 115 120

CAG AGT CGT TGC GTT AAT ATA GAA TGG CAT CGG TTC CGA TTT GGA CAG 676
Gln Ser Arg Cys Val Asn Ile Glu Trp His Arg Phe Arg Phe Gly Gln
125 130 135

GAT CAG CTA CTC CTT GCC ACA GGA TTA AAC AAT GGT CGC ATC AAA ATC 724
Asp Gln Leu Leu Leu Ala Thr Gly Leu Asn Asn Gly Arg Ile Lys Ile
140 145 150

TGG GAT GTA TAT ACA GGA AAA CTC CTC CTT AAT TTG GTA GAC CAC ATT 772
Trp Asp Val Tyr Thr Gly Lys Leu Leu Leu Asn Leu Val Asp His Ile
155 160 165 170

GAA ATC GTT AGA GAT TTA ACT TTT GCT CCA GAT GGG AGC TTA CTC CTT 820
Glu Met Val Arg Asp Leu Thr Phe Ala Pro Asp Gly Ser Leu Leu Leu
175 180 185

GTA TCA GCT TCA AGA GAC AAA ACT CTA AGA GTG TGG GAC CTG AAA GAT 868
Val Ser Ala Ser Arg Asp Lys Thr Leu Arg Val Trp Asp Leu Lys Asp
190 195 200

GAT GGA AAC ATG GTG AAA GTA TTG CGG GCA CAT CAG AAT TGG GTG TAC 916
Asp Gly Asn Met Val Lys Val Leu Arg Ala His Gln Asn Trp Val Tyr
205 210 215

ACT TGT GCA TTC TCT CCC GAC TGT TCT ATG CTG TGT TCA GTG GGC GCC 964
Ser Cys Ala Phe Ser Pro Asp Cys Ser Met Leu Cys Ser Val Gly Ala
220 225 230

AGT AAA GCA GTT TTC CTT TGG AAT ATG GAT AAA TAC ACC ATG ATT AGG 1012
Ser Lys Ala Val Phe Leu Trp Asn Met Asp Lys Tyr Thr Met Ile Arg
235 240 245 250

AAG CTG GAA GGT CAT CAC CAT GAT GTT GTA GCT TGT GAC TTT TCT CCT 1060
Lys Leu Glu Gly His His His Asp Val Val Ala Cys Asp Phe Ser Pro
255 260 265

GAT GGA GCA TTG CTA GCT ACT GCA TCC TAT GAC ACT CGT GTG TAT GTC 1108
Asp Gly Ala Leu Leu Ala Thr Ala Ser Tyr Asp Thr Arg Val Tyr Val
270 275 280

TGG GAT CCA CAC AAT GGA GAC CTT CTG ATG GAG TTT GGG CAC CTG TTT 1156
Trp Asp Pro His Asn Gly Asp Leu Leu Met Glu Phe Gly His Leu Phe
285 290 295

CCC TCG CCC ACT CCA ATA TTT GCT GGA GGA GCA AAT GAC CGA TGG GTG 1204
Pro Ser Pro Thr Pro Ile Phe Ala Gly Gly Ala Asn Asp Arg Trp Val
300 305 310

AGA GCT GTG TCT TTC ACT CAT GAT GGA CTG CAT GTT GCC AGC CTT GCT 1252
Arg Ala Val Ser Phe Ser His Asp Gly Leu His Val Ala Ser Leu Ala
315 320 325 330

GAT GAT AAA ATG GTG AGC TTC TGG AGA ATC GAT GAG GAT TGT CCG GTA 1300
Asp Asp Lys Met Val Arg Phe Trp Arg Ile Asp Glu Asp Cys Pro Val
335 340 345

CAA GTT GCA CCT TTG AGC AAT GGT CTT TGC TGT GCC TTT TCT ACT GAT 1348
Gln Val Ala Pro Leu Ser Asn Gly Leu Cys Cys Ala Phe Ser Thr Asp
350 355 360

GCC AGT GTT TTA GCT GCT GGG ACA CAT GAT GGA AGT GTG TAT TTT TGG 1396
Gly Ser Val Leu Ala Ala Gly Thr His Asp Gly Ser Val Tyr Phe Trp
365 370 375

GCC ACT CCA AGG CAA GTC CCT AGC CTT CAA CAT ATA TGT CGC ATG TCA 1444
Ala Thr Pro Arg Gln Val Pro Ser Leu Gln His Ile Cys Arg Met Ser
380 385 390

ATC CGA AGA GTG ATG TCC ACC CAA GAA GTC CAA AAA CTG CCT GTT CCT 1492
Ile Arg Arg Val Met Ser Thr Gln Glu Val Gln Lys Leu Pro Val Pro
395 400 405 410

TCC AAA ATA TTG GCG TTT CTC TCC TAC CGC GGT TAG A CTGAAGACTG 1539
Ser Lys Ile Leu Ala Phe Leu Ser Tyr Arg Gly *
415 420

CCTTTCCTGG TAGGCCTGCC AGACAGAGCG CCCTTTACAA GACACACCTC AAGCTTTACC 1599 

TCGTGCCGAA TT 1611 



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 422 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Met Ala Ser Phe Pro Pro Arg Val Asn Glu Lys Glu Ile Val Arg Ser
1 5 10 15

Arg Thr Ile Gly Glu Leu Leu Ala Pro Ala Ala Pro Phe Asp Lys Lys
20 25 30

Cys Gly Gly Glu Asn Trp Thr Val Ala Phe Ala Pro Asp Gly Ser Tyr
35 40 45

Phe Ala Trp Ser Gln Gly Tyr Arg Ile Val Lys Leu Val Pro Trp Ser
50 55 60

Gln Cys Arg Lys Asn Phe Leu Leu His Gly Ser Lys Asn Val Thr Asn
65 70 75 80

Ser Ser Cys Leu Lys Leu Ala Arg Gln Asn Ser Asn Gly Gly Gln Lys
85 90 95

Asn Lys Pro Pro Glu His Val Ile Asp Cys Gly Asp Ile Val Trp Ser
100 105 110

Leu Ala Phe Gly Ser Ser Val Pro Glu Lys Gln Ser Arg Cys Val Asn
115 120 125

Ile Glu Trp His Arg Phe Arg Phe Gly Gln Asp Gln Leu Leu Leu Ala
130 135 140

Thr Gly Leu Asn Asn Gly Arg Ile Lys Ile Trp Asp Val Tyr Thr Gly
145 150 155 160

Lys Leu Leu Leu Asn Leu Val Asp His Ile Glu Met Val Arg Asp Leu
165 170 175

Thr Phe Ala Pro Asp Gly Ser Leu Leu Leu Val Ser Ala Ser Arg Asp
180 185 190

Lys Thr Leu Arg Val Trp Asp Leu Lys Asp Asp Gly Asn Met Val Lys
195 200 205

Val Leu Arg Ala His Gln Asn Trp Val Tyr Ser Cys Ala Phe Ser Pro
210 215 220

Asp Cys Ser Met Leu Cys Ser Val Gly Ala Ser Lys Ala Val Phe Leu
225 230 235 240

Trp Asn Met Asp Lys Tyr Thr Met Ile Arg Lys Leu Glu Gly His His
245 250 255

His Asp Val Val Ala Cys Asp Phe Ser Pro Asp Gly Ala Leu Leu Ala
260 265 270

Thr Ala Ser Tyr Asp Thr Arg Val Tyr Val Trp Asp Pro His Asn Gly
275 280 285

Asp Leu Leu Met Glu Phe Gly His Leu Phe Pro Ser Pro Thr Pro Ile
290 295 300

Phe Ala Gly Gly Ala Asn Asp Arg Trp Val Arg Ala Val Ser Phe Ser
305 310 315 320

His Asp Gly Leu His Val Ala Ser Leu Ala Asp Asp Lys Met Val Arg
325 330 335

Phe Trp Arg Ile Asp Glu Asp Cys Pro Val Gln Val Ala Pro Leu Ser
340 345 350

Asn Gly Leu Cys Cys Ala Phe Ser Thr Asp Gly Ser Val Leu Ala Ala
355 360 365

Gly Thr His Asp Gly Ser Val Tyr Phe Trp Ala Thr Pro Arg Gln Val
370 375 380

Pro Ser Leu Gln His Ile Cys Arg Met Ser Ile Arg Arg Val Met Ser
385 390 395 400

Thr Gln Glu Val Gln Lys Leu Pro Val Pro Ser Lys Ile Leu Ala Phe
405 410 415 

Leu Ser Tyr Arg Gly * 
420 

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 783 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CTGTCTTCCT CCGCAGCGCG AGGCTGGGTA CAGGGTCTAT TGTCTGTGGT TGACTCCGTA 60 

CTTTGGTCTG AGGCCTTCGG GAGCTTTCCC GAGGCAGTTA GCAGAAGCCG CAGCGACCGC 120 

CCCCGCCCGT CTCCTCTGTC CCTGGGCCCG GGAGACAAAC TTGGCGTCAC GCCCTCAGCG 180 

GTCGCCACTC TCTTCTCTGT TGTTGGGTCC GCATCGTATT CCCGGAATCA GACGGTGCCC 240 

CATAGATGGC GAGCTTTCCC CCGAGGGTCA ACGAGAAAGA GATCGTGAGA TCACGTACTA 300

TAGGTGAACT TTTAGCTCCT GCAGCTCCTT TTGACAAGAA ATGTGGTCGT GAAAATTGGA 360 

CTGTTGCTTT TGCTCCAGAT GGTTCATACT TTGCTTGGTC ACAAGGACAT CGCACAGTAA 420 

AGCTTGTTCC GTGGTCCCAG TGCCTTCAGA ACTTTCTCTT GCATGGCACC AAGAATGTTA 480 

CCAATTCAAG CAGTTTAAGA TTGCCAAGAC AAAATAGTGA TGGTGGTCAG AAAAATAAGC 540 

CTCGTGACAT ATTATAGACT GTGGAGATAT AGTCTGGAGT CTTGCTTTTG GGTCATCAGT 600

WXAGAAAAA CAGAGTCGCT GTGTAAATAT AGAATGGCA? CGCTTCAGRT TTCGACAAGA 660

TCAGCTACTT CTTGCTACAG GGTTGAACAA TQGGCGTATC AAAATATGGG ATGTATATCA 720 

GGAAACTC C? CCTTAAOiTG GTAGATCATA CTGAAGTGGT CAGAGATTTA ACTTTTGCTC 780 

CAG 783

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1122 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

CTCTGTATGT CTGAATGAAG CTATAACAT? TGCCTTTTTA TTGCAGGTTT TCCTTTGGAA 60 

TATGGATAAA TACACCATGA TACGGAAACT AGAAGGACAT CACCATGATG TGGTAGCTTG 120 

TGACTTTTCT CCTGATGGAG CATTACTGGC TACTGCATCT TATGATACTC QAGTATATAT 180 

CTGGGATCCA CATAATGGAG ACATTCTGAT GGAATTTGGG CACCTGTTTC CCCCACCTAC 240 

TCCAATATTT GCTGGAGGAG CAAATGACCG OTOSGTACGA TCTGTATCTT TTAGCCATGA 300 

TGGACTGCAT GTTGCAAGCC TTGCTGATGA TAAAATGGTG AGGTTCTGGA GAATTGATGA 360 

GGATTATCCA GTGCAAGTTG CACCTTTGAG CAATQGTCTT TGCTGTGCCT TCTCTACTOA 420 

TGGGAGTGTT TTAGCTGCTG GGACACATGA CGGAAGTGTG TATTTTTGGG CCACTCCACG 480

GCAGGTCCCT AGCCTGCAAC ATTTATGTCG CATGTCAATC CGAAGAGTGA 7GCCCACCCA 540 

AGAAGTTCAG GAGCTGCCGA TTCCTTCCAA GCTTTTGGAG TTTCTCTOGT ATCGTATTTA 600 

GAAGATTCTG CCTTCCC'i'AG TAGTAGGGAC TQACAGAATA CACTTAACAC AAACCTCAAG 660

CTTTACTGAC TOCAATTATC TGTTTTTAAA GACGTAG&AG ATTTATTTAA TTTGATATGT 720

TTAAAATATT ATTTATAGAC AATAGAAGTA 780

AAAGATCTAA CTGTGAAAAC ATACATACCT 840 

TGAATGGACC CTTTTGCTTT TCTGATTTT? 900

AGCCACAATA TGTATCTTTG CTGT&AAGTG 960 

TTAGATGGTA AATACTGACT TACGAAAGTT 1020 

GTCAGCAGTT TGAGACTAGC CTGGCAAACA 1080 

AAAAAAAAAA AA 1122



(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2537 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 422..2029

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

CGGCACGAGC CGGGCTCCGT CCGGAGGAAG CGAGGCTGCG CCGCCGQCCC GGCAGGAGCG 60 

GAOGACGGGA GCGCGG6CGG TCGCGCTCGC CCTGTCGCTG ACTGCGCTGC CCCGGCCCAT 120 

CCTTGCCTGG CCOCAGGTGC CCTGGATGAG GCCGCCGCGC GTGTCCCGGC CGCTGAGTGT 180 

CCCCCGCGGT CGCCCGGCGC CTGCCCTCAA GCGGCCGCCT CTCCTTGCCC GGGTCCCCGT 240 

TTTCCCCCGG CGCAGTCCTC CTCCGGTGGG CGCCTCCGCA CCTCGGCGCA GGCGGCACGG 300 

CCCTCGGGCC GGGATGGATC CGCCGGGAAG AGGAAGACAA GCCGGGGCGT TGAGCCCCTG 360

CGCACGGTGC CGCCGCQCGT AGTGGGAGCT TACTCGCAGT AGGCTCTCGC TCTTCTAATC

A ATG GAT AAA GTG GGG AAA ATG TGG AAC AAC TTA AAA TAC AGA TGC 
Met Asp Lys Val Gly Lys Met Trp Asn Asn Leu Lys Tyr Arg Cys
1 5 10 15

GAG AAT CTC TTC AGC CAG GAG GGA GGA AGC CGT AAT GAG AAC GTG GAG 514
Gln Asn Leu Phe Ser His Glu Gly Gly Ser Arg Asn Glu Asn Val Glu
20 25 30

ATG AAC CCC AAC AGA TGT CCG TCT GTC AAA GAG AAA AGC ATC AGT CTG 562
Met Asn Pro Asn Arg Cys Pro Ser Val Lys Glu Lys Ser Ile Ser Leu
35 40 45

GGA GAG GCA GCT CCC CAG CAA GAG AGC AGT CCC TTA AGA GAA AAT GTT 610
Gly Glu Ala Ala Pro Gln Gln Glu Ser Ser Pro Leu Arg Glu Asn Val
50 55 60

GCC TTA CAG CTO GGA CTG AGC CCT TCC AAG ACC TTT TCC AGG CGG AAC 658
Ala Leu Gln Leu Gly Leu Ser Pro Ser Lys Thr Phe Ser Arg Arg Asn
65 70 75

CAA AAC TGT GCC GCA GAG ATC CCT CAA GTG GTT GAA ATC AGC ATC GAG 706
Gln Asn Cys Ala Ala Glu Ile Pro Gln Val Val Glu Ile Ser Ile Glu
80 85 90 95

AAA GAC AGT GAC TCG GGT GCC ACC CCA GGA ACG AGG CTT GCA CGG AGA 754
Lys Asp Ser Asp Ser Gly Ala Thr Pro Gly Thr Arg Leu Ala Arg Arg
100 105 110

GAC TCC TAC TCG CGG CAC GCC CCG TGG GGA GGA AAG AAG AAA CAT TCC 802
Asp Ser Tyr Ser Arg His Ala Pro Trp Gly Gly Lys Lys Lys His Ser
115 120 125

TGT TCC ACA AAG ACC CAG AGT TCA TTG GAT ACC GAG AAA AAG TTT GGT 850
Cys Ser Thr Lys Thr Gln Ser Ser Leu Asp Thr Glu Lys Lys Phe Gly
130 135 140

AGA ACT CGA AGC GGC CTT CAG AGG CGA GAG CGG CGC TAT GGA GTC AGC 898
Arg Thr Arg Ser Gly Leu Gln Arg Arg Glu Arg Arg Tyr Gly Val Ser
145 150 155

TCC ATG CAG GAC ATG GAC AGC GTT TCT AGC CGC GCG GTC GGC AGC CGC 946
Ser Met Gln Asp Met Asp Ser Val Ser Ser Arg Ala Val Gly Ser Arg
160 165 170 175

TCC CTG AGG CAG AGG CTC CAG GAC ACG GTS GGT TTG TGT TTT CCC ATG 994
Ser Leu Arg Gln Arg Leu Gln Asp Thr Val Gly Leu Cys Phe Pro Met
180 185 190

AGA ACT TAG AGC AAG CAG TCA AAG CCA CTC TTT TCC AAT AAA AGA AAA 1042
Arg Thr Tyr Ser Lys Gln Ser Lys Pro Leu Phe Ser Asn Lys Arg Lys
195 200 205

ATA CAT CTT TCT GAA TTA ATG CTG GAG AAA TGC CCT TTT CCT GCT GGC 1090
Ile His Leu Ser Glu Leu Met Leu Glu Lys Cys Pro Phe Pro Ala Gly
210 215 220

TCG GAT TTA GCA CAA AAG TGG CAT TTG ATT AAA CAG CAT ACC GCC CCT 1138
Ser Asp Leu Ala Gln Lys Trp His Leu Ile Lys Gln His Thr Ala Pro
225 230 235

GTG AGC CCA CAC TCA ACA TTT TTT GAT ACA TTT GAT CCA TCA CTG GTG 1186
Val Ser Pro His Ser Thr Phe Phe Asp Thr Phe Asp Pro Ser Leu Val
240 245 250 255

TCT ACA GAA GAT GAA GAA GAT AGG CTT CGC GAG AGA AGA CGG CTT AGT 1234
Ser Thr Glu Asp Glu Glu Asp Arg Leu Arg Glu Arg Arg Arg Leu Ser
260 265 270

ATC GAA GAA GGG GTG GAT CCC CCT CCC AAC GCA CAA ATA CAC ACC TTT 1282
Ile Glu Glu Gly Val Asp Pro Pro Pro Asn Ala Gln Ile His Thr Phe
275 280 285

GAA GCT ACT GCA CAG GTC AAC CCA TTG TAT AAG CTG GGA CCA AAG TTA 1330
Glu Ala Thr Ala Gln Val Asn Pro Leu Tyr Lys Leu Gly Pro Lys Leu
290 295 300

GCT CCT GGG ATG ACA GAG ATA AGT GGA GAT GGT TCT GCA ATT CCA CAA 1378
Ala Pro Gly Met Thr Glu Ile Ser Gly Asp Gly Ser Ala Ile Pro Gln
305 310 315

GCA ATT GTG ACT CAG AAG AGG ATT CAA CCA CCC TAT GTC TGC AGT CAC 1426
Ala Ile Val Thr Gln Lys Arg Ile Gln Pro Pro Tyr Val Cys Ser His
320 325 330 335

GGA GGC AGA AGC AGC GCC AGG TGT CCG GGG ACA GCC ACG CGC ACG TTA 1474
Gly Gly Arg Ser Ser Ala Arg Cys Pro Gly Thr Ala Thr Arg Thr Leu
340 345 350

GCA GAC AGG GAG CTT GGA AAG TTC ATA CGC AGA TCG ATT ACA TAC ACT 1522
Ala Asp Arg Glu Leu Gly Lys Phe Ile Arg Arg Ser Ile Thr Tyr Thr
355 360 365

GCC TCG TGC CAG ATT TGC TTC AGA TCA CAG GGA ATC CCT OTP ACT GGQ 1570 
Ala Ser Cys Gln Ile Cys Phe Arg Ser Gln Gly Ile Pro Val Thr Gly
370 375 380

GCG TGA TGG ACC GAT ACG AGG CCG AAC CCC TTC TAG AAG GGA AAC CGG 1618
Ala * Trp Thr Asp Thr Arg Pro Lys Pro Phe * Lys Gly Asn Arg
385 390 395

AAG GCA CGT TCT TGC TCA GGG ACT CTG CAC AGG AGG ACT ACC TCT TCT 1666
Lys Ala Arg Ser Cys Ser Gly Thr Leu His Arg Arg Thr Thr Ser Ser
400 405 410 415

CTG TGA CCT TCC GCC GCT ACA ACA GGT CTC TGC ACG CCC GGA TCG AGC 1714
Leu * Ala Ser Ala Ala Thr Thr Gly Leu Cys Thr Pro Gly Ser Ser
420 425 430

AGT GGA ACC ACA ACT TCA GCT TCG ATG CCC ATG ACC CCT GCG TGT TTC 1762
Ser Gly Thr Thr Thr Ser Ala Ser Met Pro Met Thr Pro Ala Cys Phe
435 440 445

ACT CCT CCA CGT CAC GGG GCT TC? CGA ACA CTA TAA AGA CCC CAG CTC 1810
Thr Pro Pro Arg His Gly Ala Ser Arg Thr Leu * Arg Pro Gln Leu
450 455 460

TTC CAT GTT TTT TGA ACC GTT GCT AAC GAT ATC ACT GAA TAG AAC TTT 1858
Leu His Val Phe * Thr Val Ala Asn Asp Ile Thr Glu * Asn Phe
465 470 475

CCC TTT CAG CCT GCA GTA TAT CTG CCG CGC AGT GAT CTG CAG ATG CAC 1906
Pro Phe Gln Pro Ala Val Tyr Leu Pro Arg Ser Asp Leu Gln Met His
480 485 490 495

TAC GTA TGA TGG GAT TGA CGG GCT CCC GCT ACC GTC GAT GTT ACA GGA 1954
Tyr Val * Trp Asp * Arg Ala Pro Ala Thr Val Asp Val Thr Gly
500 505 510

TTT TTT AAA AGA GTA TCA TTA TAA ACA AAA AGT TAG GGT TCG CTG GTT 2002
Phe Phe Lys Arg Val Ser Leu * Thr Lys Ser * Gly Ser Leu Val
515 520 525

AGA ACG AGA CCA GTC AAA GCA AAG TAACTCCTGT CCCCAAAGGG CACTAACTAA 2056
Arg Thr Arg Pro Val Lys Ala Lys
530 535

GTCTGCTCCT CCCGTGCATC GAACTGCACC CATAGGAGGC AGTCAGCTGC TAGGATTTCC 2116

CACCCAGAAT GGGAGCTTAG TCATTAGCCT CTGCCCT&TG GQGTCCGCTG TTCCTCAGAC 2176

AAAGGTGCCT AGGGACAGCA AGATGGCTTG CA3GTGTTCG GTGGGCTGTG ACAACTGAGG 2236 

GAGGCAACTC TGGGGCATTT GCTATGAAGA A7TCTATTTC TTACCGAAGA ACAAATTATT 2296 

AATA'JPTSSAT GGGTATTTCA ATAGTGTGAC TAATGTTTGA AATTATTTTT TCTAAGAATT 2356 

TTTCTATAAC CTTCAGAAAA AGTAGTGATG TTTGTAGTTA CTATAAATCA AGCTTTGAAA 2416 

OTTCAAAACR AACAAGTTAA ATAAAAGACT ACCTTCCTTT TAGAGAAAAC AAATGCAAGT 2476 

TTTCCCAGCC ACAGGCATTG TGCACTGTTA ATGTTGCTTG IKPATCAGCTC CTTTGTCCTC 2536

C 2537 

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 535 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

Met Asp Lys Val Gly Lys Met Trp Asn Asn Leu Lys Tyr Arg Cys Gln
1 5 10 15

Asn Leu Phe Ser His Glu Gly Gly Ser Arg Asn Glu Asn Val Glu Met
20 25 30

Asn Pro Asn Arg Cys Pro Ser Val Lys Glu Lys Ser Ile Ser Leu Gly
35 40 45

Glu Ala Ala Pro Gln Gln Glu Ser Ser Pro Leu Arg Glu Asn Val Ala
50 55 60

Leu Gln Leu Gly Leu Ser Pro Ser Lys Thr Phe Ser Arg Arg Asn Gln
65 70 75 80

Asn Cys Ala Ala Glu Ile Pro Gln Val Val Glu Ile Ser Ile Glu Lys
85 90 95

Asp Ser Asp Ser Gly Ala Thr Pro Gly Thr Arg Leu Ala Arg Arg Asp
100 105 110

Ser Tyr Ser Arg His Ala Pro Trp Gly Gly Lys Lys Lys His Ser Cys
115 120 125

Ser Thr Lys Thr Gln Ser Ser Leu Asp Thr Glu Lys Lys Phe Gly Arg
130 135 140

Thr Arg Ser Gly Leu Gln Arg Arg Glu Arg Arg Tyr Gly Val Ser Ser
145 150 155 160

Met Gln Asp Met Asp Ser Val Ser Ser Arg Ala Val Gly Ser Arg Ser
165 170 175

Leu Arg Gln Arg Leu Gln Asp Thr Val Gly Leu Cys Phe Pro Met Arg
180 185 190

Thr Tyr Ser Lys Gln Ser Lys Pro Leu Phe Ser Asn Lys Arg Lys Ile
195 200 205

His Leu Ser Glu Leu Met Leu Glu Lys Cys Pro Phe Pro Ala Gly Ser
210 215 220

Asp Leu Ala Gln Lys Trp His Leu Ile Lys Gln His Thr Ala Pro Val
225 230 235 240

Ser Pro His Ser Thr Phe Phe Asp Thr Phe Asp Pro Ser Leu Val Ser
245 250 255

Thr Glu Asp Glu Glu Asp Arg Leu Arg Glu Arg Arg Arg Leu Ser Ile
260 265 270

Glu Glu Gly Val Asp Pro Pro Pro Asn Ala Gln Ile His Thr Phe Glu
275 280 285

Ala Thr Ala Gln Val Asn Pro Leu Tyr Lys Leu Gly Pro Lys Leu Ala
290 295 300

Pro Gly Met Thr Glu Ile Ser Gly Asp Gly Ser Ala Ile Pro Gln Ala
305 310 315 320

Ile Val Thr Gln Lys Arg Ile Gln Pro Pro Tyr Val Cys Ser His Gly
325 330 335

Gly Arg Ser Ser Ala Arg Cys Pro Gly Thr Ala Thr Arg Thr Leu Ala
340 345 350

Asp Arg Glu Leu Gly Lys Phe Ile Arg Arg Ser Ile Thr Tyr Thr Ala
355 360 365

Ser Cys Gln Ile Cys Phe Arg Ser Gln Gly Ile Pro Val Thr Gly Ala
370 375 380

* Trp Thr Asp Thr Arg Pro Lys Pro Phe * Lys Gly Asn Arg Lys
385 390 395 400

Ala Arg Ser Cys Ser Gly Thr Leu His Arg Arg Thr Thr Ser Ser Leu
405 410 415

* Ala Ser Ala Ala Thr Thr Gly Leu Cys Thr Pro Gly Ser Ser Ser
420 425 430

Gly Thr Thr Thr Ser Ala Ser Met Pro Met Thr Pro Ala Cys Phe Thr
435 440 445

Pro Pro Arg His Gly Ala Ser Arg Thr Leu * Arg Pro Gln Leu Leu
450 455 460

His Val Phe * Thr Val Ala Asn Asp Ile Thr Glu * Asn Phe Pro
465 470 475 480

Phe Gln Pro Ala Val Tyr Leu Pro Arg Ser Asp Leu Gln Met His Tyr
485 490 495

Val * Trp Asp * Arg Ala Pro Ala Thr Val Asp Val Thr Gly Phe
500 505 510

Phe Lys Arg Val Ser Leu * Thr Lys Ser * Gly Ser Leu Val Arg
515 520 525

Thr Arg Pro Val Lys Ala Lys
530 535

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1221 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

GATTAAACAG CATACAGCTC CTGTGAOCCO ACATTCAACA TTTTTTGATA CTTTGATCCA 60 

TCTTTGGTTT CTACAGAAGA TGAAGAAG&T AGGCTTAGAG AGAGAAGGCG GCTTAGTATT 120 

GAAGAAGGGG TTGATCCCCC TCCCAATGCA CAAATACATA CATTTGAAGC TACTQCACAO 180 

GTTAATCCAT TATTAAACTG GGACCAAAAT TAGCTCCTGG AATGACTGAA ATAAGTGGGG 240

ACAGTTCTGC AATTCCACAA GCTAATTGTG ACTCGGAAGA GGATACAACC ACCCTGTGTT 300 

GCAGTCACGG AGGCAGAAGC AGCGTCAGAT ATCTGGAGAC AGCCATACCC ATGTTAGCAG 360 

ACAGGGAGCT TGGAAAGTCC ACACACAGAT TGATTACATA CACTGCTTCG TGCCTGATTT 420 

GCTTCAAATT ACAGGGAATC CCTGTTACTG GGGAGTGATG GACCGTTATG AAGCAGAAGC 480

CCTTCTCGAA GGGAAACCTG AAGGCACGTT TTTGCTCAGG GACTCTGCGC AAGAGGACTA 540

CTTCTTCTCT GTGAGCTTCC GCCGATACAA CAGATCCCTG CATGCCCGAA TTGAGCAGTG 600 

GAATCACAAC TTTAGTTTCG ACGCCCATGA CCCGTGTGTA TTTCACTCCT CCACTGTAAC 660

GGGACTTTTA GAACATTATA AAGATCCCAG TTCGTGCATG TTTTTTGAAC CATTGCTTAC 720 

TATATCACTA AATAGGACTT TCCCTTTTAG CCTGCAGTAT ATCTGTCGCG CGGTAATCTG 780 

CAGGTGCACT ACGTATGATG GAATTGATGG GCTCCCTCTA CCCTCAATGT TACAGGATTT 840

TTTAAAAGAG TATCATTATA AACAAAAAGT TAGAGTTCGC TGGTTGGAAC GAGAACCAGT 900 

CAAGGCAAAG TAAACTCTCC GGTCCCCAAA OGGTGTTAAC TAGGTCCGCT TTCATGTGCA 960

TCAGACAGTA CACCTATAGC AAGCACACGT AOC&GTOTTA GGCTTTTTCA TACAGTATGT 1020

AAGCTTAGTG TTAGTATCTG TCAGATSCTA CCTGCTGTTA CTTATTCAGA TAAACATGGT 1080 

GCCTATTGGA ACAATAGCGG ATAGAGCTAC AGGTGTTCAG TAAGACTACA AAAACATTTT 1140 

GCCTATTTCG CTAACAGTTT GGTTTTTAAT GGCTGTGGTA TTTGAGTGAG GCAACTCTGG 1200

GGCATTTGTT ATGAAGAAAT G 1221 



(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2369 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 116..1330

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

GGCACGAGGC GGTGGTGGCG GCGGCGGGCG CGGCCGCGGC GGGGCGGGCG CGGAATGAAG 60 

GCCCACGGCC CTGGGGGCTG AGGCGCCCGC CGCCTGGGGC OGWCOGCQCG TCCTC ATG 118 

Met 
1 

GAG GCC GGA GAG GAG CCG CTG CTG CTG GCT GAA CTG AAG CCT GGG CGC 166
Glu Ala Gly Glu Glu Pro Leu Leu Leu Ala Glu Leu Lys Pro Gly Arg
5 10 15

CCC CAC CAG TTC GAG TGG AAG TCA AGC TGC GAG ACC TGG AGC GTG GCC 214
Pro His Gln Phe Asp Trp Lys Ser Ser Cys Glu Thr Trp Ser Val Ala
20 25 30

TTC TCG CCA GAG GGT TCC TGG TTC GCC TGG TCT CAA GGA CAC TGC GTG 262
Phe Ser Pro Asp Gly Ser Trp Phe Ala Trp Ser Gln Gly His Cys Val
35 40 45

GTC AAG CTG GTC CCC TGG CCC TTA GAG GAA CAG TTC ATC CCT AAA GGA 310
Val Lys Leu Val Pro Trp Pro Leu Glu Glu Gln Phe Ile Pro Lys Gly
50 55 60 65

W GAA GCC AAG AGC CGA AGC AGC AAG AAT GAC CCA AAA GGA CGG GGC 358
Phe Glu Ala Lys Ser Arg Ser Ser Lys Asn Asp Pro Lys Gly Arg Gly
70 75 80

AGT CTG AAG GAG AAG ACG CTG GAC TGT GGC GAG ATT GTG TGG GGG CTG 406
Ser Leu Lys Glu Lys Thr Leu Asp Cys Gly Gln Ile Val Trp Gly Leu
85 90 95

GCC TTC AGC CGG TGG CCC TCT CCA CCC AGC AGG AAA CTC TGG GCA CGG 454
Ala Phe Ser Pro Trp Pro Ser Pro Pro Ser Arg Lys Leu Trp Ala Arg
100 105 110

CAC CAT CCC CAG GCG CCT GAT GTT TCT TGC CTG ATC CTG GCC ACA GGT 502
His His Pro Gln Ala Pro Asp Val Ser Cys Leu Ile Leu Ala Thr Gly
115 120 125

CTC AAC GAT GGG CAG ATC AAG ATT TGG GAG GTA CAG ACA GGC CTC CTG 550
Leu Asn Asp Gly Gln Ile Lys Ile Trp Glu Val Gln Thr Gly Leu Leu
130 135 140 145

CTT CTG AAT CTT TCT GGC CAC CAA GAC GTC GTG AGA GAT CTG AGC TTC 598
Leu Leu Asn Leu Ser Gly His Gln Asp Val Val Arg Asp Leu Ser Phe
150 155 160

ACG CCC AGC GGC ACT TTG ATT TTG GTC TCT GCA TCC CGG GAT AAG ACA 646
Thr Pro Ser Gly Ser Leu Ile Leu Val Ser Ala Ser Arg Asp Lys Thr
165 170 175

CTT CGA ATT TGG GAC CTG AAT AAA GAC GGT AAG CAG ATC CAG GTG TTA 694
Leu Arg Ile Trp Asp Leu Asn Lys His Gly Lys Gln Ile Gln Val Leu
180 185 190

TCC GGC CAT CTG CAG TGG GTT TAC TGC TGC TCC ATC TCC CCT GAC TGT 742
Ser Gly His Leu Gln Trp Val Tyr Cys Cys Ser Ile Ser Pro Asp Cys
195 200 205

AGC ATG CTG TGC TCT GCA GCT GGG GAG AAG TCG GTC TTT CTG TGG AGC 790
Ser Met Leu Cys Ser Ala Ala Gly Glu Lys Ser Val Phe Leu Trp Ser
210 215 220 225

ATG CGG TCC TAC ACA CTA ATC CGG AAA CTA GAA GGC CAC CAA AGC AGT 838
Met Arg Ser Tyr Thr Leu Ile Arg Lys Leu Glu Gly His Gln Ser Ser
230 235 240

GTT GTC TCC TGT GAT TTC TCT CCT GAT TCA GCC TTG CTT GTC ACA GCT 886
Val Val Ser Cys Asp Phe Ser Pro Asp Ser Ala Leu Leu Val Thr Ala
245 250 255

TCC TAT GAC ACC AGT GTG ATT ATG TGG GAC CCC TAC ACC GGC GCG AGG 934
Ser Tyr Asp Thr Ser Val Ile Met Trp Asp Pro Tyr Thr Gly Ala Arg
260 265 270

CTG AGG TCA CTT CAT CAC ACA CAA CTT GAA CCC ACC ATG GAT GAC AGT 982
Leu Arg Ser Leu His His Thr Gln Leu Glu Pro Thr Met Asp Asp Ser
275 280 285

GAC GTC CAC ATG AGC TCC CTG AGG TCC GTG TGC TTC TCA CCT GAA GGC 1030
Asp Val His Met Ser Ser Leu Arg Ser Val Cys Phe Ser Pro Glu Gly
290 295 300 305

TTG TAT CTC GCT ACG GTG CCA GAT GAC AGG CTG CTC AGG ATC TGG GCT 1078
Leu Tyr Leu Ala Thr Val Ala Asp Asp Arg Leu Leu Arg Ile Trp Ala
310 315 320

CTG GAA CTG AAG GCT CCG GTT GCC TTT GCT CCG ATG ACC AAT GGT CTT 1126
Leu Glu Leu Lys Ala Pro Val Ala Phe Ala Pro Met Thr Asn Gly Leu
325 330 335

TGC TGC ACG TTC TTC CCA CAC GGT GGA ATT ATT GCC ACA GGG ACG AGA 1174
Cys Cys Thr Phe Phe Pro His Gly Gly Ile Ile Ala Thr Gly Thr Arg
340 345 350

GAT GGC CAT GTC CAG TTC TGG ACA GCT CCC CGG GTC CTG TCC TCA CTG 1222
Asp Gly His Val Gln Phe Trp Thr Ala Pro Arg Val Leu Ser Ser Leu
355 360 365

AAG CAC TTA TGC AGG AAA GCC CTC CGA AGT TTC CTG ACA ACG TAT CAA 1270
Lys His Leu Cys Arg Lys Ala Leu Arg Ser Phe Leu Thr Thr Tyr Gln
370 375 380 385

GTC CTA GCA CTG CCA ATC CCC AAG AAG ATG AAA GAG TTC CTC ACA TAC 1318
Val Leu Ala Leu Pro Ile Pro Lys Lys Met Lys Glu Phe Leu Thr Tyr
390 395 400

AGG ACT TTC TAGCAGTGCC GGCTCCCCCA CCTCCTGCAG CAGCAGCAGT 1367
Arg Thr Phe 

403 

ACAAGGGACT GGCTAGGATG GAGTCAGGCA GCTCACACTG GACCAGTGTG GACCTTCCTT 1427 

CCTCCCATGG CATCTGCAAG TAGGTCTGCG TGACCCCACT TCTGTGGTGC CGGCCTTACC 1487 

TCGTCTTCAT CCGTGGTGAG CAGCCTTCGT CAGTCTAGTT GTGTTGAAGC CAAGTGCAGT 1547 

TGTQGATGT'l* GCTGGGGTAA TAAAGGCAAG CGGGCTCCAG AGCCTCTCTG GTGGCGGCCA 1607 

AGCCACACTC CCTTAACTGG GAAGTACCTG CCACG7AGGG CATTTCTGCT GCCTATTTCC 1667 

AGCCAGCGGC TGCATGGTTT GAAGTTCCTC CGTTGTGGTC AGAAGAACTC TGGTOTTTGG 1727

TTCCCTGCTC ASCTGCGCGT GGACTGGGCT GAGCTCCTCA CCATACACTA GTGCCGGCTT 1787

TTGTTTCCTG TAAACAGTGG TTGCATGTGT AGAGAAGTAA CAAGCGAGTA TTCAGATCAT 1847 

ACGAGGAGGC GTTCCTCGGT GCATGACGGT CAGATGGCCA TTTATCAGCA TATTTATTTG 1907 

TATTTTCTCA GCACATAGTA AGGTACAACT GTGTTTTCTC AATTGTCTCG AAAAAACAGA 1967 

GTPCTTAAGT GGCCCAGTTG TGGAGCCAAG TCTAAGTCGT GTGGAGTCAG TGCTGACATC 2027 

ACTGGCTTGT GCTGTCTGTC ACATGTGTTT GTCTCTGCTG CTTGACCTCA TGGGATGTAC 2087

CCTCCAGTTC AACTGCCCAA AACAGACAGC CCCTTCCAAG CACC'QT'TCTT TGACAGCGGT 2147 

AGCAQCTACC TATTCAAGAC GCCTCACACA AAATCTGCCT TAGAAAGTTA ATATATTTTA 2207 

AATTATTTTA AAAGAAACTC AACATCTTAT TCTTTGGCCT TTCTTAATTG ATGCTTTATG 2267 

GAGGCAGTG7 TAACATTGTA CAGTGTATGC ATAGAGGAGT CTCCTCTATT TGAAGAACAA 2327 

TGCAAAATGA GGCTTTCATT GAAGGGAAAA AAAAAAAAAA AA 2369

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 404 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Met Glu Ala Gly Glu Glu Pro Leu Leu Leu Ala Glu Leu Lys Pro Gly
1 5 10 15

Arg Pro His Gln Phe Asp Trp Lys Ser Ser Cys Glu Thr Trp Ser Val
20 25 30

Ala Phe Ser Pro Asp Gly Ser Trp Phe Ala Trp Ser Gln Gly His Cys
35 40 45

Val Val Lys Leu Val Pro Trp Pro Leu Glu Glu Gln Phe Ile Pro Lys
50 55 60

Gly Phe Glu Ala Lys Ser Arg Ser Ser Lys Asn Asp Pro Lys Gly Arg
65 70 75 80

Gly Ser Leu Lys Glu Lys Thr Leu Asp Cys Gly Gln Ile Val Trp Gly
85 90 95

Leu Ala Phe Ser Pro Trp Pro Ser Pro Pro Ser Arg Lys Leu Trp Ala
100 105 110

Arg His His Pro Gln Ala Pro Asp Val Ser Cys Leu Ile Leu Ala Thr
115 120 125

Gly Leu Asn Asp Gly Gln Ile Lys Ile Trp Glu Val Gln Thr Gly Leu
130 135 140

Leu Leu Leu Asn Leu Ser Gly His Gln Asp Val Val Arg Asp Leu Ser
145 150 155 160

Phe Thr Pro Ser Gly Ser Leu Ile Leu Val Ser Ala Ser Arg Asp Lys
165 170 175

Thr Leu Arg Ile Trp Asp Leu Asn Lys His Gly Lys Gln Ile Gln Val
180 185 190

Leu Ser Gly His Leu Gln Trp Val Tyr Cys Cys Ser Ile Ser Pro Asp
195 200 205

Cys Ser Met Leu Cys Ser Ala Ala Gly Glu Lys Ser Val Phe Leu Trp
210 215 220

Ser Met Arg Ser Tyr Thr Leu Ile Arg Lys Leu Glu Gly His Gln Ser
225 230 235 240

Ser Val Val Ser Cys Asp Phe Ser Pro Asp Ser Ala Leu Leu Val Thr
245 250 255

Ala Ser Tyr Asp Thr Ser Val Ile Met Trp Asp Pro Tyr Thr Gly Ala
260 265 270

Arg Leu Arg Ser Leu His His Thr Gln Leu Glu Pro Thr Met Asp Asp
275 280 285

Ser Asp Val His Met Ser Ser Leu Arg Ser Val Cys Phe Ser Pro Glu
290 295 300

Gly Leu Tyr Leu Ala Thr Val Ala Asp Asp Arg Leu Leu Arg Ile Trp
305 310 315 320

Ala Leu Glu Leu Lys Ala Pro Val Ala Phe Ala Pro Met Thr Asn Gly
325 330 335

Leu Cys Cys Thr Phe Phe Pro His Gly Gly Ile Ile Ala Thr Gly Thr
340 345 350

Arg Asp Gly His Val Gln Phe Trp Thr Ala Pro Arg Val Leu Ser Ser
355 360 365

Leu Lys His Leu Cys Arg Lys Ala Leu Arg Ser Phe Leu Thr Thr Tyr
370 375 380

Gln Val Leu Ala Leu Pro Ile Pro Lys Lys Met Lys Glu Phe Leu Thr
385 390 395 400

Tyr Arg Thr Phe 



(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1246 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

GACACTGCAT CGTCAAACTG ATCCCCTGGC CGTTGGAGGA GCAGTTCATC CCTAAAGGG? 60 

TTSAAGCCAA AAGCCGAAGT AGCAAAAATG AGACGAAAGG GCGGGGCAGC CCAAAAGAGA 120

AGACGCTGGA CTGTGGTCAG ATTGTCTGGG GGCTGGCCTT CAGCCTGTGC TTTCCCCACC 180 

CAGCAGGAAG CTCTGGGCAC GCCACCACCC CCAAGTGCCC GATGTCTCTT GCCTGGTTCT 240 

TGCTACGGGA CYCAACGATG GGCAGATCAA GATCTGGGAG GTGCAGACAG GGCTCCTGCT 300 

TTTGAATCTT TCCGGCCACC AAGA'iGTCGT GAGAGATCTG AGCTTCACAC CCAGTGGCAG 360

TTTGATTTTG GTCTCCGCG? CACGGGATAA GACTCTTCGC ATCTGGGACC TGAATAAACA 420 

CGGTAAAC&G ATTCAAGTGT TATCGGGCCA CCTGCAGTGG GTTTACTGCT GTTCCATCTC 480 

CCCAGACTGC AGCATGCTGT GCTCTGCAGC TGGAGAGAAG TCGGTCTTOC TATGGAGCAT 540 

GAGGTCCTAC ACGTTAATTC GGAAGCTAOA GGGCCATCAA AGCAGTGTTG TCTCTTGTGA 600 

CTTCTCCCCC GACTCTGCCC TGCTTGTCAC GGCTTCTTAC GATACCAATG TGATTATGTG 660 

GGACCCCTAC ACCGGCGAAA GGCTGAGGTC ACTCCACCAC ACCCAGGTTG AGCCCGCCAT 720

GGATGACAGT GACGTCCACA TTAGCTCACT GAGATCTGTG TGCTTCTCTC CAGAAGGCTT 780

GTACCTTGCC ACGGTGGCAG ATGACAGACT CCTCAGGATC TGGGCCCTGG AACTGAAAAC 840 

TCCCATTGCA TTTGCTCCTA TGACCAATGG GCTTTGCTGG CACATTTTTT CCACATGGTG 900

GAGTCATTGC CACAGGGACA AGAGATGGCC ACGTCCAGTT CTGGACAGCT CCTAGGGTCC 960 

TGTCCTCACT GAAGCACTTA TGCCGGAAAG CCCT7CGAAG TTTCCTAACA ACTTACCAAG 1020 

TCC'FAGCAC? GCCAATCCCC AAGAAAATGA AAGAGTTCCT CACATACAGG ACTTTTTAAG 1080 

CAACACCACA TCTTGTGCTT CTTTGTAGCA GGGTAAATCG TCCTGTCAAA GGGAGTTGCT 1140

GGAATAATGG GCCAAACATC TGGTCTTGCA TTGAAATAGC ATTTCTTTGG GATTGTGAAT 1200

AGAATGTAGC AAAACCAGA? TCCAGTGTAC TAGTCATGOA TTTTTC 1246

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 422 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

ACCATGG7TC CAAGTCCTCT CCCCTGTGGT CAAOTTGCCC GAATGTTGGG CCCAAGTOCC 60

TTTTCCTCCT TGGGCCTCCC CTTCTGACCT GCAGGACAGT TTTCCGGAGC CCATTTGGTA 120 

TGAGGTATTA ATTAGCCTTA ACTAAATTAC AGGGGACTCA GAGGCCGTGC TCCTGACCGA 180 

TCCAGACACT A1TTTTTTTT TTTTTTTTTA ACAATGGTGT GCATGTGCAG GAAATGACAA 240 

ATTTGTATGT CAGATTATAC AAGGATGTAT TCTTAAACCG CATGACTATT CAGATGGCTA 300 

CTGAGTTATC AGTGGCCATT TATTAGCATC ATATTTATTT GTATTTTCTC AACAGATGTT 360

AAGGTACAAC TGTGTTTTTC TCGATTATCT AAAAACCATA GTACTTAAAT TGAAAAAAAA 420 

AA 422 



(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2019 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:

GGCACGAGGC GGGGTCA6GG CGGAGGCT6A GGACCAAGTA GGCATGGCGG AGGGCGGGAC 60 

CGGCCCCGAT GGACGGGCCG OCCCGGGACC CGCAGG'TCCT AATCTGAAGG AGTGGCTGAG 120 

GGAGCAGTTC TGTGACCATC CACTGGAGCA CfG'TGACGAT ACAAGACTCC ATGATGCAGC 180 

CTATGTAGGG GACCTCCAGA CCCTCAGGAA CCTACTGCAA GAGGAGAGCT ACCGGAGCCG 240

CATCAATGAG AAGTCTGTCT GGTGCTGCGG CTGGCTTCCC TGCACACCAC TGAGGATCGC 300

AGCCACTGCA GGCCATGGGA ACTGTGTGGA CTTCCTCA'IA CGCAAAGGGG CCGAGG7GGA 360 

CCTGGTGGAT GTCAAG3GGC AGACTGCCCT GTATGTGGCT GTAGTGAACG GGCACTTGGA 420 

GAGCACTGAG ATCCTTTTGG AAGCTGGTGC TGATCCCAAC GGCAGCCGGC ACCACCGCAG 480 

CACTCCTGTG TACCATGCCT YTCGTGTGGG TAGGGACGAC ATCCTGAAGG CTCXTATCAG 540 

GTATGGGGCA GATGTTGATG TCAACCATCA TCTGAATTCT GACACCCGGC CCCCTTTTTC 600 

ACGGCGGCTA ACCTCCTTGG TGGTCTOTCC TCTATACATC AGTGCTGCCT ACCATAACCT 660 

TCAGTGCTTC AGGCTGCTCT TGCAGGCTGG GGCAAATCCT GACTTCAATT GCAATGGCCC 720 

TGTCAACACC CAGGAGTTCP ACAGGGGATC CCCTG6GTGT GTCATGGATG CTGTCCTGCG 780 

CCATGGCTGT GAAGCAGCCT TCGTGAGTCT GTTGGTAGAG TTTGGAGCCA ACCTGAACCT 840 

GGTGAAGTGG GAATCCCTGG GCCCAGAGGC AAGAGGCAGA AGAAAGATGG ATCCTGAGGC 900

CTTGCAGGTC TTTAAAGAGG CCAGAAGTA? TCCCAGGACC TTGCTGAGTT TGTGCCGGGT 960

GGCTGTGAGA AGAGCTCTTG GCAAATACCG ACTGCATCTG GTTCCCTCGC TGCCGCTGGC 1020 

AGACCCCATA AAGAAGTTTT TGCTTTATGA GTAGCATTCA CATGCAGTGC TGACTGCAAT 1080 

GTGGAAGCCG ATCACCTGCA GTGAAAACTG ACACAGACTC TGGCATCCTG GGAACCATGG 1140 

CCTCTGCTGC CAGCTTGATC CTTGGCTGTC AGTG&AGAAA AAACGGCTGT GTTCTCTTGG 1200

ACTOTGATTC TATCTCAGG1 1 GCTTGG3CCA TCGAACGCTC CTTGAGTCAT TGTCAACTGA 1260

GAGGCACATA CAAACTTAAT TTTGTTCCTC 7TCAGTCTCT CTGtTTTGGA TTCTTCCTGG 1320

CAATGTGTGC AGCATGGGCT GAGCCTGGTG ATTGCCCTAG TGGGGAAGGC TTTTTTCTCC 1380 

AGGCTATGCA TCTATTTATG TTCCTACTTT GCAATT-mTT GTTCTTTTAA QGCTTGATAT 1440

CAAAACAGAA AGAGGTTTGT TAAGAAAAGA TATAGGQAGA AAGGAATTCC GGTTCCGTGC 1500

AC T'i'GC T AGC CTGCTTTCCT TGCCTGGGTT TGTCTGTCFA TGCTGCCTGG TGCACATCCC 1560 

TTCTCTTTGC TGCCACTGTT CTATTTTGGG AGTOGTCTTC CGTCTAASAT GGCTTCTGGG 1620

GTTCTATCTT ATTGCACAGA GGTCCCAGAA CAGTGTTCAT AGGQCACCAT CTGCTCTGCC 1680 

AAGCKSTTTTC TC5&TGTCTTA CCCTGGGGAT CTTCAGACAG TGGTTACCTT TAGGAGACCC 1740

ACCTGGAACT AACCATTAAG TGACTGCCCA CATTCAGATC AGGGACCATC TTAATAGTAC 1800

TCACTGCCAG TCCTCACAAG AGAAGATGAC ACGGGTGCTC TCTTCAGACA CTCCCATACA 1860 

GCAAGTTCK3A AAATGTCTTG GTCACCTGGG TTGTTCCCAG GCTACAACTT CTTGGTGTTC 1920

CAC7AARACC AGRATATCCT AGTTTTTTGG GTi'GACTGTT CCCTCCCXAC TKPCCTTGAA 1980 

NCCCAATGCC CNTTTGTKTN «3TTGCT , fCC CTAAAAKTT 2019 



(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 350 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

Ala Arg Gly Gly Val Arg Ala Glu Ala Glu Asp Gln Val Gly Met Ala
1 5 10 15

Glu Gly Gly Thr Gly Pro Asp Gly Arg Ala Gly Pro Gly Pro Ala Gly
20 25 30

Pro Asn Leu Lys Glu Trp Leu Arg Glu Gln Phe Cys Asp His Pro Leu
35 40 45

Glu His Cys Asp Asp Thr Arg Leu His Asp Ala Ala Tyr Val Gly Asp
50 55 60

Leu Gln Thr Leu Arg Asn Leu Leu Gln Glu Glu Ser Tyr Arg Ser Arg
65 70 75 80

Ile Asn Glu Lys Ser Val Trp Cys Cys Gly Trp Leu Pro Cys Thr Pro
85 90 95

Leu Arg Ile Ala Ala Thr Ala Gly His Gly Asn Cys Val Asp Phe Leu
100 105 110

Ile Arg Lys Gly Ala Glu Val Asp Leu Val Asp Val Lys Gly Gln Thr
115 120 125

Ala Leu Tyr Val Ala Val Val Asn Gly His Leu Glu Ser Thr Glu Ile
130 135 140

Leu Leu Glu Ala Gly Ala Asp Pro Asn Gly Ser Arg His His Arg Ser
145 150 155 160

Thr Pro Val Tyr His Ala Xaa Arg Val Gly Arg Asp Asp Ile Leu Lys
165 170 175

Ala Leu Ile Arg Tyr Gly Ala Asp Val Asp Val Asn His His Leu Asn
180 185 190

Ser Asp Thr Arg Pro Pro Phe Ser Arg Arg Leu Thr Ser Leu Val Val
195 200 205

Cys Pro Leu Tyr Ile Ser Ala Ala Tyr His Asn Leu Gln Cys Phe Arg
210 215 220

Leu Leu Leu Gln Ala Gly Ala Asn Pro Asp Phe Asn Cys Asn Gly Pro
225 230 235 240

Val Asn Thr Gln Glu Phe Tyr Arg Gly Ser Pro Gly Cys Val Met Asp
245 250 255

Ala Val Leu Arg His Gly Cys Glu Ala Ala Phe Val Ser Leu Leu Val
260 265 270

Glu Phe Gly Ala Asn Leu Asn Leu Val Lys Trp Glu Ser Leu Gly Pro
275 280 285

Glu Ala Arg Gly Arg Arg Lys Met Asp Pro Glu Ala Leu Gln Val Phe
290 295 300

Lys Glu Ala Arg Ser Ile Pro Arg Thr Leu Leu Ser Leu Cys Arg Val
305 310 315 320

Ala Val Arg Arg Ala Leu Gly Lys Tyr Arg Leu His Leu Val Pro Ser
325 330 335

Leu Pro Leu Pro Asp Pro Ile Lys Lys Phe Leu Leu Tyr Glu
340 345 350



(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 419 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

GCAtCCATGG CGGAGGGCGG CAGCACGACG GGCGC3GCAGG GCCGGGCTCC GCAGGTCGTA 60 

ATCTGAAGGA GTGGCTGAGG GAGCAATTTT GTGATCATCC GCTGGAGCAC TGTGAGGACA 120

CGAGGCTCCA TGATGCAGCT TACGTCGGGG ACCTCCAGAC CCTCAGGAGC CTATTGCAAG 180

AGGAGAGCTA CCGGAQCCGC ATCAACGAGA AGTCTOTCTG GTGCTGTGGC TGGCTCCCCT 240 

GCACACCGTT GCGAATCGCG GCCACTGCAG GCCATGGGAG CTGTGTGGAC TTCCTCATCC 300

GGAAGGGGGC CGAGGTGGAT CTGGTGGACG TAAAAGGACA GACGGCCCTG TATGTGGCTG 360

TGGTGAACGG GCACCTAGAG AGTACCCAGA TCCTTCTCGA AGCTGSCGCG GACCCCAAC 419

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 595 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

GAGGAAGAAG AAAAGTGGAC CCTGAGGCCT TGCAGGTCTT TAAAGAGGCC AGAASTGTTC 60 

CCAGAACCTT GCTGTGTCTG TGCCGTGTGG CTGTGAGAAG AGCTCTTGGC AAAACCGGCT 120

TCATCTGATT CCTTCGCTGC CTCTGCCAGA CCCCATAAAG AAGTTTCTAC TCCATGAGTA 180 

GACTCCAAGT GCTGCGGTTG ATTCCAGTGA GGGAGAAAGT GATCTGCAGG GAGGTGGACA 240 

CCGAGCCCTG AGTGCTGTGC TGCTGCTGGT CTCCTGATGG CTGTTGCTQC AGAAGATGTC 300 

CTCGTAGACT GTCATTGCTC CTCAGGTGCC TGGGCCGCTG AACAGTCCT? GGGTCATTGT 360

CAGCTGAGAG GCTTATACTA AAGTTATTAT TGTTTTTCCC AAGTTCTCTG TTCTGGATTT 420

TCAGTTGCAT ATTAATOTAA CGGGCCATGG GGTATGTACA TGTAGGGGCT GAGGTTGGAG 480 

GCCTACTAAT TTCCTGTAGG GAAGACTCCC AGCACTTCTG GAACTGTGCT TCTCTTTATT 540 

TTTCTACTTC TCAATTTGAT GGTTCGATTA AAGCCTTCTA GTATCTCAAT GAAAA 595

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 896 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 4..396

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

CTC ATG TCC GCA ATT CTG AAG GTT GGA CAC CAC TGC TGG CTG CCT GTG 48
Met Ser Ala Ile Leu Lys Val Gly His His Cys Trp Leu Pro Val
1 5 10 15

ACA TCC GCT GTC AAT CCC CAA AGG ATG CTG AGG CCA CCA CCA ACC GCT 96
Thr Ser Ala Val Asn Pro Gln Arg Met Leu Arg Pro Pro Pro Thr Ala
20 25 30

GTT TTC AAC TGT GCC GCT TGC TGC TGT CTG TGG GGG CAG ATG CTG ATG 144
Val Phe Asn Cys Ala Ala Cys Cys Cys Leu Trp Gly Gln Met Leu Met
35 40 45

AAT ACA TAC CGT GTA GTT CAG CTT CCT GAG GAG GCC AAG GGC TTG GTG 192
Asn Thr Tyr Arg Val Val Gln Leu Pro Glu Glu Ala Lys Gly Leu Val
50 55 60

CCA CCA GAG ATT CTA CAG AAG TAC CAT GGA TTC TAC TCT TCC CTC TTT 240
Pro Pro Glu Ile Leu Gln Lys Tyr His Gly Phe Tyr Ser Ser Leu Phe
65 70 75

GCC TTG GTG AGG CAG CCC AGG TCG CTG CAG CAT CTC TGC CGT TGT GCG 288
Ala Leu Val Arg Gln Pro Arg Ser Leu Gln His Leu Cys Arg Cys Ala
80 85 90 95

CTC CGC AGT CAC CTG GAG GGC TGT CTG CCC CAT GCA CTA CCG CGC CTT 336
Leu Arg Ser His Leu Glu Gly Cys Leu Pro His Ala Leu Pro Arg Leu
100 105 110

CCC CTG CCA CCG CGC ATG CTC CGC TTT CTG CAG CTC GAC TTT GAG GAT 384
Pro Leu Pro Pro Arg Met Leu Arg Phe Leu Gln Leu Asp Phe Glu Asp
115 120 125

CTG CTC TAC TAGGCTTGCT GCCCTGTGAA CAAAGCAGAC CCCACCCCCA 433
Leu Leu Tyr
130

CCCCAAGGGC ATCTCTCAGC AATGAATGAT GCAAGGCGGT CTGTCTTCA& GTCAGGAGTG 493

GACGCCOT6A TCCACACTTG AGAGAAGAGG CCAGATCAGC ACCYGGCTGG TAGTGATNGC 553

AGAGGGCACC TGTGCAGATC TGTGTGCGCA CTGGAAATCT CTAGGCTGAA GGCYAGAGCA 613

AATGGTGCAR GTGTTAGTCC TTGGGANGAG AGACAGAM3G TGAGAAAGCA AGACAGAGG? 673

GAGAGTGCAC ATCTCAAGTG GTAGATTGCC TTAAAAGAAA GCTAAAAAAA GAAAAAGAT? 733

CQGGCGAACT TCTTTAGGGG TAATGCTGCA GCGTGTTAAA CTGACTGACC AGCGTCCATA 793

TCTTTGOACC CTTCCCGGGT GAAAAAGCCC CTTCATOCTC CAGCGCTCCC CAAGGGTGCT 853

TAGCAATACC GGGTGCTTTT CTGCCGCAAA GTGAGTTACC AAA 896

(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 130 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

Met Ser Ala Ile Leu Lys Val Gly His His Cys Trp Leu Pro Val Thr
1 5 10 15

Ser Ala Val Asn Pro Gln Arg Met Leu Arg Pro Pro Pro Thr Ala Val
20 25 30

Phe Asn Cys Ala Ala Cys Cys Cys Leu Trp Gly Gln Met Leu Met Asn
35 40 45

Thr Tyr Arg Val Val Gln Leu Pro Glu Glu Ala Lys Gly Leu Val Pro
50 55 60

Pro Glu Ile Leu Gln Lys Tyr His Gly Phe Tyr Ser Ser Leu Phe Ala
65 70 75 80

Leu Val Arg Gln Pro Arg Ser Leu Gln His Leu Cys Arg Cys Ala Leu
85 90 95

Arg Ser His Leu Glu Gly Cys Leu Pro His Ala Leu Pro Arg Leu Pro
100 105 110

Leu Pro Pro Arg Met Leu Arg Phe Leu Gln Leu Asp Phe Glu Asp Leu
115 120 125

Leu Tyr
130
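
The CDS feature given for SEQ ID NO:28 (LOCATION 4..396) can be cross-checked against the 130-residue length stated for SEQ ID NO:29. The following is only a minimal sketch of that arithmetic, not part of the listing itself. It assumes the CDS range is inclusive and that it contains the terminating stop codon, which the listing suggests because a stop codon directly follows the final Tyr codon. The function names are hypothetical.

```python
def codons_in_cds(start: int, end: int) -> int:
    """Number of codons in an inclusive, 1-based CDS range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def expected_residues(start: int, end: int, includes_stop: bool = True) -> int:
    """Residue count implied by a CDS range.

    Assumption: when includes_stop is True, the last codon of the range
    is a stop codon and does not contribute an amino acid.
    """
    return codons_in_cds(start, end) - (1 if includes_stop else 0)


# SEQ ID NO:28, CDS 4..396, compared with SEQ ID NO:29 (130 amino acids)
seq28_residues = expected_residues(4, 396)
print(seq28_residues)
```

Under the same assumption, the CDS range 1..624 stated for SEQ ID NO:35 implies 207 residues, which agrees with the length stated for SEQ ID NO:36.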

(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 436 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

GTSGGOGCG? CATCATGACC TCCTCTAGGG CTC'FGCAACA TGACTCCTGT GGTGCAAATC 60

AACAAATTGT TCACTGATGA ATCCACAAOG ATCTCTGGGC CTACAACCAG GTCCTGGTCC 120

ACATGACTGT CGTCTTCGGA GAAGGCACCA CTCGCCCCCG GCAGGTACGG CTGACACCTC 180

CATGGGAGAA GACGTATCCA GGCAGCAGCT GCGCGGCCCT TCAAGAGGGC ACATCCCCTC 240

ATCTAAAGGC ACGGTGTACT GAA3GTAGTC CTGA6AC&TG AGTCCGATTA CTACAGGCAC 300

GTGTTCCTCC AGGTGGAGGC TCAGGTCCCC GGGTGAGCTG GGGCTGCAGC GGGACTCAGG 360

GCGCGGCTCT GGCTGCAGGT CTCGCAGCTC CCTGGGCTGT AGCTCCCGCA GATCCTTGCG 420

CACACCGTTG ACTOGT 436

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2180 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

TTAATAGTAC CTACATAGTA GAAAATTATA ACTCCACTTT AAAACAATGT TTTCTTTCTA 60

TTCAAATCAA TTTAAAACTT TTTATAAACA TTAATGTTGC AAGAGAATCC AGTCCATTTA 120

TGAAAATTAG TTGACAATCA ASTTCACCCA AGAAAATGTT GACTAAGCTA AAGAAATCAC 180 

AGA7AAAACA TTTTACCAAA AGGATAGGTA ACACACAAAA AAATGCTATC ACAGGAAGCT 240 

ATGATCATCT AATATTTCTT TAATAATAAT TCTAGTTCCA TAGGTTTTCA TGTTATGCCA 300 

ATTTGTACCC GAGTTTAATT ACAGAAAAGG CAACAATTTC TAAATTGGTG GTATACATTT 360

dTTACAATT TTTTAATGTA AGGCCATTTA TTAAAATAGA CAAACTAGAA GATGAAAACG 420 

AAGGCAACAG AAAAATTCAA CTTTTCACAA CCAAAAGAAT TAOCACAACC TTAGAAATAA 480

TTTAGAAAAA AGTGTTGTTA AAAGATATGT TGCAGATCTC CGTTCCATTA CCCAAGATTA 540

TGTCAATTCA CGATTCTAAA TAAATCTTTT TAAAGTAAGA GATTAAAAAC 7CATCTTCAG 600 

TGTATATGTA AATTCCGTGO TTTTATCACA CAGGTATGTT TATTCAACAC TGCTTTGGAA 660 

ATGGACCAT? TAAAAGGACA TGGCAATTTC CATTCTGTTA AGTTTCATTC AACCTTTACT 720 

TAGGGGTTGA 1TACCACATG AAATCTGCTT TTAATOCATA AAAATCACAG TGGATTAGCC 780 

AGCAAAAGGG ACTGGGCGGG GGGGGCATTG AGGAGAATTT GATAATTCAC ATTGTC5ATTA 840 

TTCTGCACAT TGATGAAACA TAATTCACAC CTCTAAAACC TCAAG&CTTC CCTTTTTTAA 900 

AGAACCAAAA TAAACCCAAG ACACCTTGCT GACACTTCCC CACCCCTAAA CA&ACTOATG 960

ACTCTTTTAC ACATAAAACT GAAATAGTTA TGGCAGCAAA AGATTTTGAT GGCAATGAAA 1020
TTATTCCCAA AGTGCAAGAT GCAGGGTTCT 1080

AATCCTTCAT TTTGTTTGGC AAAGGCAGTT 1140 

GTATAACAAA ACGACACAGG TACTGCAACG 1200 

GGCAAGTTC? GACGGAAGTG CAGATTCCAG 1260 

CATTTTCAGA GTCCCTGATT G&ATGCTCCA 1320 

ACATCGGCTG TTCATAAAAG CTAAACCTAC 1380

TTTTACCATG GGAGCGAAAG TCACAGCTTA 1440 

CAAGAAAAGA ACCATCTGGC ACGTTTGCTA 1500 

GTGCCCAGTA CCATCCTTGC TTTGCAAGTT 1560

CCATGGGACC ACTACTTTGC ACTGAGTCAT 1620

GATCAAAATT CAAATGACAG CGCATAACTT 1680 

TCCACTGAAA GTTCCCCTTT GGGATTTGGA 1740

AGATTGGAGG GACATCCATC GTGAACCCGC 1800 

GTGCCAATCA ACAAGCCATT CACCGGACTG 1860

GTCCTGGTCT ACCTGACTCT CATCCTCGGG 1920 

CGCAGAGACT TCCATGGGAG AAGAGCroTC 1980 

ATACATCCCC TCATCTAAAG GCACAGTATA 2040 

AACGACAGGC ACATGTTCAT CCAGGTGAAG 2100 

CTTCAGTGAA TPQGCTTGCT CCTGGCACGT 2160 

2180 



(2) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2649 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

GGC&CGAGGC TGTG'i'CC AGC ACACAGAGAG GGCCCGGCCA TCTGCTTTGG TTCAGAGCCC 60 

TGTG'TCTGTC TGTCACTTAG ACTCTTCCTC CCGGCTCGCA GCTCACCCTC CATCCTCCTT 120 

&CTGGCTCCA GCATGACTCG CTTCTCTTAT GCAGAGTACT TTGCTCTGTT TCACTCTGGC 180 

TCTGCACCTT CCAGGTCCCC TTCGTCTCCC GA3AACCCAC CGGCCCGCGC ACCCCTGGGT 240

CTSWCCAAG GGGTCATGCA GAAGTATAGC AGCAACCTGT TCAAGACCTC CCAGATGGCG 300 

GCTATGGACC CCGTGCTGAA GGCCATCAAG GAAGGGGATG AAGAGGCCTT GAAGATCATG 360 

ATCCAGGATG GGAAGAATC? TGCAGAGCCC A1CAAGGAGG GCTGGCTGGC GCTCCACGAG 420 

GCTGCCTACT ATGGCCAGCT GGGCTGCCTG AAAGTCCTGC AGCAAGCCTA COCAGGGACC 480 

A'lTGACCAAC GCACACTGCA GGAAGAGACA GCATTATACC TGGCCACATG CAGAGAACAC 540 

CTGGATTGCC TCCTGTCGCT GCTCCAGGCG GGGGCAGAGC CTGACATCTC TAACAAATCC 600 

AGGGAGACTC CACTTTACAA AGCCTGTGAG CGCAAGAACG CGGAGGCGGT GAGGATATTG 660 

GTGCGATACA ACGCAGACGC CAACCACCGC TGTAACAGGG GCTGGACCGC ACTGCACGAG 720 

TCTGTCTCCC GCAATGACCT GGAGGTCA'J'G GJVQAtCCTAG TGAGTGGCGG GGCCAAGGTG 780 

GAGGCCAAGA ATGTCTACAG CATCACCCCT TTGTTTGTOG CTGCCCAGAG TGGGCAGCTG 840 

GAGGCCCTGA GGTTCCTGGC CAAGCATGOT GCAGACATCA ACACGCAGGC CAGTGACAGT 900 

GCATCAGCCC TCTACGAGGC CAGCAAGAAT GAGCATGAAG ACGTGGTAGA GTTTCTTCTC 960 

TCTCAGGGCG CCGATGCTAA CAAAGCCAAC AAS3SACGGCC TOCTCCCCCT GCATGTTGCC 1020

TCCAAGAAGG GCAACTATAG AATAGTGCAG ATGCTGCTGC CTCiXSACCAG CCGCACGCGC 1080

C5TGCGCCCJTA GCGGCATCAG CCCGCTOCAC CTAGCGGCOG AGCGCAACCA CGACGCGGTG 1140 

CTGGAGGCGC TGCTGGCCGC GCSCTTCO&C GTGAACGCAC CTCTGGCTCC CGAGCGCGCC 1200 

CGCCTCTACG AGGACCGCCG CAGTTCTGCG CTCTACTTCG CTGTGGTCAA CAACAATGTG 1260

TACGCCACCG AGCTG'PTGCT GCTGGCGGGC GCGGACCCCA ACCGCGATGT CATCAGCCCT 1320 

CTGCTCGTGG CCA'i'CCGCCA CGGC'I'GCCTG CGCACCATGC AGCTGCTGTT GGACCATGGC 1380 

GCCAACATCG ACGCCTACAT CGCCACTCAC CCCACCOCCT TTCCAGCCAC CATCATGTT7 1440 

GCCATGAAGT GCCTGTCGW ACI'CAAGTTC CTTA'TGGACC TCGGCTGCGA TGGCGAGCCC 1500 

TGCTTCTCCT GCCTGTACGG CAACGGGCCG CACCACCCGC CCCCCGACCT GGCCGCTTCC 1560

ACGACGCACC CGTGGACGAC AAGGCACCTA GCGTGGTGCA GTTCTGTGAG TTCCTGTCGG 1620 

CCCCGGAAG? GAGCCGCTGG GCGGGACCCA TCATCGA'KS? CCTCC'X'GGAC TATG7QGGCA 1680 

ACGTGCAGCT GT3CTCCCGG CTGAAGGAGC ACATCGACAG CTTTGAGGAC TGGGCTGTCA 1740 

TCAAGGAGAA GGCAGAACCT CCGAGACCTC TGGCTCACCT CTGCCGGCTG CGGGTTCGGA 1800 

AGGCCATAGG AAAATACCGG ATAAAACTGC TGGACACACT GCCOCTTCCC GGCAGGCTAA 1860

TCAGATACTT GAAATATGAG AATACACAGT AACCAGCCTG GAGAGGAGAT GTGGCCTTCA 1920 

GACTGTTTCC GGOAOGCCCC AGGTGGCCTG CATCCAGGAC CCCCTGGGGT CAGAACAGGT 1980 

GTGACCTTGC TGGTTCTTVQ CTGGAGCTTC ACCCAAAGTG AGAACCTGAT GTGGGGAGTG 2040 

GACGTGSAAC CTCTGCTTTC ACACTGTCAO CGGATCGCAG ACCCGCTCTG CTTCTGGCCA 2100 

TAGCCAGAGA CCTTCAACCT GGGGCCAGGG GAGAGCTGGT CTGGGCAAGG TGGCCCAGGC 2160

AGGAATCCTG GCCTTAAGCT GGAGAACTTG TAGGAATCCC TCACTGGACC CTCAGCTTTC 2220 

AGGCTGCGAG GGAGACGCCC AGCCCAAGTA TTTTATTTCC GTGACACAAT AACGTTGTAT 2280 

CAGAAAAAAA AAAAAACATG GOCGCAGCTT ATTCCTTAGT AGGGTATTTA CTTGCATGCG 2340 

CGCTTAAAGC TACTGGAAAC ATGCGTTCCA CTATGCTTCA GAATCCCCTT GCACTGGTAA 2400 
ACGAGAGCCG ACGTGCTTCA AGGTTGGAT7 TTTGGTTGCC CCTTTGGCGT TCCGCGGGTT 2460

TSTCCOACGT AATTGACCCC QTGTTTTGTC ACTTTCGAGT OTTCCG&.CTA TTGGGGGGCT 2520 

TTTGGTTGTC CCCAAAATTG TGGGTGGTGT GCGGACGCCA CGAGAAGTGG TTCATGGGCG 2580

ATAATCATTA CTGGAGAATG TAGAGCGGCG GTTTTACGAA TAAATATTTT TTAAGCCGCC 2640

TTCCCAAAA 2649

(2) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 495 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

CCTCCTGA6A GTTCGCCGGC CCGGGCCCAA TGGGTTGTTC CAAGGGGTCA TGCAGAAATA 60 

CAGCAGCAGC TTGTTCAAGA CCTCCCAGCT GGCGCCTGCG GACCCCTTGA TAAAGGCCAT 120 

CAAGGATGCG ATGAAGAGGC CTTGAAGACC ATGATCAAGG AAGGGAAGAA TCTCGCAGAG 180 

CCCAACAAGG AGGGCTGGCT GCCGCTGCAC GAGGCCGCAT ACTATGGCCA GGTGGGCTGC 240 

CTGAAAGTCC TGCAGCGAGC GTACCCASGG ACCATCGACC AGCGCACCCT GCAGGAGGAA 300 

ACAGCCGTTT ACTTGGCAAC GTGCAGGGGC CACCTGGACT GTCTCCTGTC ACTGCTCCAA 360 

GCAGGGGCAG AGCGGGACAT CTCCAACAAA TCCCGAGAGA ACCOCTCTAC AAAGCCTGTG 420

AGCGCAAGAA CGCGGAAGCC GTGAAGATTC TTGGT3CAGC ACAACGCAGA CACCAACAAC 480

GCTGCAACCG GGCTG 495 

(2) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 709 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

GTGCAGCTCT GCTCC5CGGCT GAAGSAACAC ATCGACAGCT TTGAGGACTG GGCCGTCATC 60 

AAGGAGAAGG CAGAACCTCC AAGACCTCTG OCTCACCTTT GCCGACTGCG GGTTCGAAAG 120 

GCCATTGGGA AATACCGTAT AAAACTCCTA GACACCTTGC CGCTCCCAGG CASGCTGATT 180 

AGATACCTGA AATACGAGAA CACCCAGTAA CTGGGGCCAC GGGGAGAGAG GAGTAGCCCC 240

TCAGACTCTT CTTACTAAGT CTCAGGACGT CGGTGTTCCC AACTCCAAGG GGACCTGGTG 300 

ACAGACGAGG CTGCAGGCTG CCTCCCTCTC AGCCTGGACA GCTACCAGGA TCTCACTGGG 360 

TCTCAGGGCC CAGAGCTTTG GCCAGAGCAG AGAACAGAAT GTGTCAAGGA GAAGAATCAT 420 

TTGTTTACAA ACTGATGAGC AGATCCCAGA CCTTCTCTAC CTTCAGGAAT GGCAGAAACC 480

TCTATTCCTG GGGCCAGGGC AGAGCTTGAG GTGTTCTGGG GAAGGTQGTG CTCAGAGCCT 540 

TCCCTGTGCC CCTCCACTTG 'X'TCTGGAAAA CTCACCACT? GACTTCAGAG CTTTCTCTCC 600 

AAAGACTAAG ATGAAGACGT GGCCCAAGGT AGGGGGTAGG GGGAGCCTGG GTCTTGGAGG 660

GCTTTGTTAA GTATTAATAT AATAAATGTT ACACATGTGA AAAAAAAAA 709 

(2) INFORMATION FOR SEQ ID NO:35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 848 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..624

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

TTG GAG AAG TGT GGT TGG TAT TGG GGG CCA ATG AAT TGG GAA GAT GCA 48
Leu Glu Lys Cys Gly Trp Tyr Trp Gly Pro Met Asn Trp Glu Asp Ala
1 5 10 15

GAG ATG AAG CTG AAA GGG AAA CCA GAT GGT TCT TTC CTG GTA CGA GAC 96
Glu Met Lys Leu Lys Gly Lys Pro Asp Gly Ser Phe Leu Val Arg Asp
20 25 30

AGT TCT GAT CCT CGT TAC ATC CTG AGC CTC AGT TTC CGA TCA CAG GGT 144
Ser Ser Asp Pro Arg Tyr Ile Leu Ser Leu Ser Phe Arg Ser Gln Gly
35 40 45

ATC ACC CAC CAC ACT AGA ATG GAG CAC TAC AGA GGA ACC TTC AGC CTG 192
Ile Thr His His Thr Arg Met Glu His Tyr Arg Gly Thr Phe Ser Leu
50 55 60

TGG TGT CAT CCC AAG TTT GAG GAC CGC TGT CAA TCT GTT GTA GAG TTT 240
Trp Cys His Pro Lys Phe Glu Asp Arg Cys Gln Ser Val Val Glu Phe
65 70 75 80

ATT AAG AGA GCC ATT ATG CAC TCC AAG AAT GGA AAG TTT CTC TAT TTC 288
Ile Lys Arg Ala Ile Met His Ser Lys Asn Gly Lys Phe Leu Tyr Phe
85 90 95

TTA AGA TCC AGG GTT CCA GGA CTG CCA CCA ACT CCT GTC CAG CTG CTC 336
Leu Arg Ser Arg Val Pro Gly Leu Pro Pro Thr Pro Val Gln Leu Leu
100 105 110

TAT CCA GTG TCC CGA TTC AGC AAT GTC AAA TCC CTC CAG CAC CTT TGC 384
Tyr Pro Val Ser Arg Phe Ser Asn Val Lys Ser Leu Gln His Leu Cys
115 120 125

AGA TTC CGG ATA CGA CAG CTC GTC AGG ATA GAT CAC ATC CCA GAT CTC 432
Arg Phe Arg Ile Arg Gln Leu Val Arg Ile Asp His Ile Pro Asp Leu
130 135 140

CCA CTG CCT AAA CCT CTG ATC TCT TAT ATC CGA AAG TTC TAC TAC TAT 480
Pro Leu Pro Lys Pro Leu Ile Ser Tyr Ile Arg Lys Phe Tyr Tyr Tyr
145 150 155 160

GAT CCT CAG GAA GAG GTA TAC CTG TCT CTA AAG GAA GCG CAG CGT CAG 528
Asp Pro Gln Glu Glu Val Tyr Leu Ser Leu Lys Glu Ala Gln Arg Gln
165 170 175

TTT CCA AAC AGA AGC AAG AGG TGG AAC CCT CCA CGT AGC GAG GGG CTC 576
Phe Pro Asn Arg Ser Lys Arg Trp Asn Pro Pro Arg Ser Glu Gly Leu
180 185 190

CCT GCT GGT CAC CAC CAA GGG CAT TTG GTT GCC AAG CTC CAG CTT TGAAGAACCA 631
Pro Ala Gly His His Gln Gly His Leu Val Ala Lys Leu Gln Leu
195 200 205

AATTAAGCTA CCATGAAAAG AAGAGGAAAA GTGAGGGAAC AGGAAGGTTG GGATTCTCTG 691

TGCAGAGACT TTGGTTCCCC ACGCAAGCCC TGGGGCTTGG AAGAAGCACA TGACCGTACT 751

CTGCGTGGGG CTCCACCTCA CACCCACCCC TGGGCATCTT AGGACTGGAG GGGCTCCTTG 811

GAAAACTGGA AGAAGTCTCA ACACTGTTTC TTTTTCA 848



(2) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 207 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

Leu Glu Lys Cys Gly Trp Tyr Trp Gly Pro Met Asn Trp Glu Asp Ala
1 5 10 15

Glu Met Lys Leu Lys Gly Lys Pro Asp Gly Ser Phe Leu Val Arg Asp
20 25 30

Ser Ser Asp Pro Arg Tyr Ile Leu Ser Leu Ser Phe Arg Ser Gln Gly
35 40 45

Ile Thr His His Thr Arg Met Glu His Tyr Arg Gly Thr Phe Ser Leu
50 55 60

Trp Cys His Pro Lys Phe Glu Asp Arg Cys Gln Ser Val Val Glu Phe
65 70 75 80

Ile Lys Arg Ala Ile Met His Ser Lys Asn Gly Lys Phe Leu Tyr Phe
85 90 95

Leu Arg Ser Arg Val Pro Gly Leu Pro Pro Thr Pro Val Gln Leu Leu
100 105 110

Tyr Pro Val Ser Arg Phe Ser Asn Val Lys Ser Leu Gln His Leu Cys
115 120 125

Arg Phe Arg Ile Arg Gln Leu Val Arg Ile Asp His Ile Pro Asp Leu
130 135 140

Pro Leu Pro Lys Pro Leu Ile Ser Tyr Ile Arg Lys Phe Tyr Tyr Tyr
145 150 155 160

Asp Pro Gln Glu Glu Val Tyr Leu Ser Leu Lys Glu Ala Gln Arg Gln
165 170 175

Phe Pro Asn Arg Ser Lys Arg Trp Asn Pro Pro Arg Ser Glu Gly Leu
180 185 190

Pro Ala Gly His His Gln Gly His Leu Val Ala Lys Leu Gln Leu
195 200 205
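
The codon-to-residue correspondence between SEQ ID NO:35 and SEQ ID NO:36 can be spot-checked with a small translation routine. The sketch below is illustrative only and is not part of the listing. It assumes the standard genetic code, and it uses the seven codons that end at nucleotide position 384 of SEQ ID NO:35, which the listing translates as Lys Ser Leu Gln His Leu Cys. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Seven codons from SEQ ID NO:35 (nucleotides 364 to 384)
fragment = "AAA TCC CTC CAG CAC CTT TGC"
peptide = translate(fragment)
print(peptide)
```

In one-letter code, Lys Ser Leu Gln His Leu Cys corresponds to KSLQHLC, so the printed peptide can be compared directly with the residues listed for this stretch.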



(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 464 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:

GTTCCAAGCC TAACCCATCT TTGTCCTTTG <3AAA9raCGGG CCAGTCTAAA AGCAGAGCAC 60

CTTCACTCTG ACATTTTCAT CCA.TCACWFG CCACTTCCCA GAAGTCTGCA GAACTATTTG 120 

CTCTATGAAG AGGTTTTAAG AATGAATGAG ATTCTAGAAC CAGCAGCTAA TCAGGATGGA 180

GAAACCAGCA AGGCCACCTG ACACAGG7CC TTTAATTCTG TTTAGTCACA AAAGACGGC? 240

TGTGTGACTG TTTGGATTTG GTGATCAAAT GTCCATGTTT ACAGTTGCTT 7TCCCAGTTT 300

GTGTCTTTCC CAATATTGTG AACCTTATCC ATCTTGCCTT ACTCAGTTTT ATTTGTAGTG 360 

CACTTTGTTG TGTATTATTT GTTTACCTGA CCATTTTCTA CTTTATTCTG CTAATAAACT 420

GTAATTCTGA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAA 464 

(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 747 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:

GGGGATCGAA AGCGGGGGCT TCTOGGACGC AGCTCTGGAG ACGCGGCCTC GGACCAGCCA 60 

TTTCGGTGTA GAAGTGGCAG CACGGCAGAC TGGTCAAACA AATGSATTTT ACAGAGGCTT 120 

ACGCGGACAC GTGCTCTACA GTTGGACTTG CTGCCAGGGA AGGCAATGTT AAAGTCTTAA 180

GGAAACTGCT CAAAAAGGGC CGAAGTGTCG ATSTTGCTGA TAACAGGGGA TGGATGCCAA 240 

TTCATGAAGC AGCTTATCAC AACTCTGTAG AATGTTOGCA AATGTTAATT AATGCAGATT 300 

CATCTGAAAA CTACATTAAG ATGAAGACCT TTGAAGGTTT CTGTGCTTTG CATCTCGCTG 360

CAAGTCAAGG ACATTGGAAA ATCGTACAGA TTCTTTTAGA AGCTGGGGCA GATCCTAATG 420 

CAACTACTTT AGAAGAAACG ACAGCATTGT TTTTAGCTGT TGAAAATGGA CAGATAGATG 480

TGTTAAGGCT GTTGCTTCAA CACGGAGCAA ATGTTAATGG ATCCCATTCT ATGTGTGGAT 540

GGAACTCCTT GCACCAGGCT TCTTTTCROG AAAATGCTGA GATCATAAAA TTGCTTCTTA 600

GAAAAGGAGC AAACAAGGAA TGCCAGGATG ACTTTGOAAT CACACCTTTA TTTGTGCCTG 660

CTCAGTATGG CCAAGCTAG& AAGCTTTGAA GCATACTTAT TTCATCCGGG TGCAAATGTC 720 

AATTGTCAAG CCTTGGACAA AGCTACC 747

(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1018 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

CACAAATGGG ACCATACAAA AATCTTGGAC TTGTTAATAA CCACTTACTA ACCGGGACCT 60 

GTGACACTGG GCTAAACAAA GTAAGTCCCT GTTTACTCAG CAGTGTTTGG GGGACATGAA 120

GGATTGCCTA GAAATATTAC TCCGGAATGG TCTACAGCCC AGACGCCCAG GCGTGCCTTG 180

TTTTTGGATT CAGTTCTCCT GTGTGCATGG CTTTCCAA&A GGAGGTGGAG CTGTAGTTCT 240 

TTGGAATTGT GAACATTCTT TTGAAATATG GAGCCCAGAT AAATGAACTT CATTTGGCAT 300

ACTGCCTGAA GTACGAGAAG TTTTCGATAT TTCGCTACTT TTTGAGGAAA GGTTGCTCAT 360

TGGGACCATG GAACCATATA TATGAATTTG TAAATCATGC AATTAAAGCA CAAGCAAAAT 420 

ATAAGGAGTG GTTGCCACAT CTTCTGGTTG CTGGATTTGA CCCACTGATT CTACTGTGCA 480 

ATTCTTGGAT TGACTCAGTC AGCATTGACA CCCTTATCTT CACTTTGGAQ TTTACTAATT 540 

GGAAGACACT TGCACCAGCT GTTGAAAGGA TGCTCTCTGC TCGTGCCTCA AACGCTTGGA 600

TTCTACAGCA ACATATTGCC CACTGTTCCA TCCCTGACCC ATCTTTGTCG TTTGGAAATT 660

CGGTCCAGTC TAA&ATCAGA ACGTCTACG6 TCTOACAGTT ATATTAGTCA GCTGCCACTT 720

CCCAGAAGCC TACATAATTA TTOGCTCTAT GAAGACGTTC TGAGGATGTA TGAAGTTCCA 780

GAACTGGCAG CTATTCAAGA TGGATAAATC AGTGAAACTA CTTAACACAG CTAATTTTTT 840

TCTCTGAAAA ATCATCGAGA CAAAAGAGCC ACAGAGTACA AGTTTTTATG ATTTTATAGT 900

CAAAAGATGA TTATTGATTG TCAGATAGGT TAGGTTTTGG GGGGCCAGTA GTTCAGTGAG 960

AATGTTTATQ TTTACAACTA GCCTTCCCAG TAAAAAAAAA AAAAAAAAAA AAAAAAAA 1018

(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1897 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:

CGGGGGGCTG GGACCTGGGG CGTAACCG7C TCTACCACGA CGGCAAGAAC CAGCCAAGTA 60 

AAACATACCC AGCCT'TTCTG GAGCCGGACG AGACATTCAT 7GTCCCTGAC TCCTTTTTCG 120 

TGGCCCTGGA CATGRATGAT GGGACCTTAA GTTTCATCGT GGATGGACAG TACATGGGAG 180 

TGGCTTTCCG GGGACTCAAG GGTAAAAAGC TGTATCCTGT AGTGAGTGCC GTCTGGOGCC 240 

ACTGTGAGAT CCGCA'i'GCGC TACTTGAACG GACTTGATCC TGAGCCCCTG CCACTCATGG 300 

ACCTGTGCCG GCGTTCGGTG CGCCTAGCGC TGGGAAAAGA GCGCCTGGGT GCCATCCCCG 360

CTCTGCCGCT ACCTGCCTCC CTCAAAGCCT ACCTCCSCTA CCAGTGATCG ACATCCCAGG 420 

ACCGCCATAC GACAGCCATC TGGTGCCAAR TCACTGAGCC CGTTGGGGTC CGCCGACCCC 480

TGCGCCTGGG ATGGAAGCCC ACCTCAGCCA TGGGCAGACG TOCCCCCTCA TCCTACCGGC 540 

TGCCTCTGCT GGGGGAACCT ATGCCAACGG ACTTCTCCCT TCCCAACACT GGCTGAAGCA 600

GCAGCACCCA GGCCCTTOCC TGAACCAGAT GCftSAGAAYA AACTATGAAA ACCTCTCTCA 660 

GGCGCC'PTCT GCTCTCAGGT GGAGTGGGC? GCCCCCCACT C1CTGCAGAG AGAGGCTACA 720 

CCCACCTGGG GGG'TCCTGGG AGGTAAGACT AGTAGGAGGT GCCAGGGCTG AHTCCAAAAG 780 

CACfGflAIGGC CAGGAJ4CAGG CCATACAGAT GAAGCTCAGG ATGTCACATA CCATGGACAM 840 

TGAGACAGAA CCCCAGGTTG GAKTTCCCTT GGGCCAACGA GTGCCAGCTT TAATGTCAGC 900 

TGCMGG'i'GCT C'TGTGGCC'PG TATTTATTCT TTAAACAGTA GCAAAOGCCA TTTATTTATT 960 

CCACTTAGAA AGGAA&CCTT GGTGGGTGGY TOCCCI'CGAV OTGCTTTCCC CCACCTCCCT 1020

GGAATGTGTG TGCCACACCT OTCCTK3TCC CAGGCCAGGA CTQTGGCACA TGAGCTGGI'G 1080 

TGCACAGATA CACGTATGTC GTCGTGCATG ACCCC3X3ACT AGTTCCTAAG TAGCCCTGCA 1140 

CCAAGCACCA GAGCAGACCC CAAGAGAGGC CCGTGCAAG? CCCCATGTCC CCAGGTCCCT 1200 

GCTTCTGTTG CCTTGGGACT CATACACCGG CACACGTGTT TCAGCCTCTT GACTTCCATG 1260

AGCTTCGAAT TTTGCCCCCG ATTCTTCTGA TATTTCCCAT TGGCATCCTC CAAAGCTCTG 1320 

GGCCTGGAGG GCATTAGGAC ACATGGAATG AGTGGGGTCT CCAGCCCCTG GGAAAGCCAC 1380

TGGCAAGGCA GGATTAGAAA GACCAAGAGC AGGGTGGGGC GCGATGAAGC CTGTATGCCT 1440 

CTCAGGCTCA AGACCCCGCC ACACACCCAC TCAAGCCTCA GAAGTGGTGT GTAGGGCAGC 1500 

CCCA3GAGAG GAATGCCTGT CCTAGCAGCA CGTACATGGA GCACCCCACA TGTGCl'CCAG 1560 

CCCTCTGGC? GTTTCTCTTG C'i'CTAGAATC AACTCCCTAC ATTGGGAATG TAGCCATTTG 1620 

GTAGAGGACT TGCCTAGCCT GCAGGAAGCT CACGTTCCAT CCCCTGCACC AAGGAGAATC 1680 

AAAGCTCAGG AGGCTGAGGC AGGAGGATTG CTGTCAGTGG TGTACAGAGG TCATGGCCAT 1740 

CCTGGGCTA'T ATTAAACCTT GTCCTTTAAG AAAAAGAAAA GAAATCAACT TCCATTGAAT 1800 

CTGAGTTCTG CTCATTTCTG CACAGGTACA ATAGATGACT TRATTTGTTG AAAAATGKTT 1860

AATATATTTA CKTATATA7A TATTTGTAAG AAGCATT 1897

(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 134 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

Gly Gly Trp Asp Leu Gly Arg Asn Arg Leu Tyr His Asp Gly Lys Asn
1 5 10 15

Gln Pro Ser Lys Thr Tyr Pro Ala Phe Leu Glu Pro Asp Glu Thr Phe
20 25 30

Ile Val Pro Asp Ser Phe Phe Val Ala Leu Asp Met Xaa Asp Gly Thr
35 40 45

Leu Ser Phe Ile Val Asp Gly Gln Tyr Met Gly Val Ala Phe Arg Gly
50 55 60

Leu Lys Gly Lys Lys Leu Tyr Pro Val Val Ser Ala Val Trp Gly His
65 70 75 80

Cys Glu Ile Arg Met Arg Tyr Leu Asn Gly Leu Asp Pro Glu Pro Leu
85 90 95

Pro Leu Met Asp Leu Cys Arg Arg Ser Val Arg Leu Ala Leu Gly Lys
100 105 110

Glu Arg Leu Gly Ala Ile Pro Ala Leu Pro Leu Pro Ala Ser Leu Lys
115 120 125

Ala Tyr Leu Leu Tyr Gln
130



(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

AAGGGTAAAA AACTGTATCC TGTAGTGAGT GCCGTCTGGG GCCACTGTAG ATCCGAATGC 60 

GCTACTTGAA CGGACTCGAT CCCGAGACTG CCGCTCATGG ATTTGTGCCG TCGCTCGGTG 120

C6CCT0GCCC TGGGGAGGGA GCGCCTGGGG GAGAACCACA CCTGCCGCTG CCGGCTTCCC 180 

TCAAGGCCTA CCTCCTCTAC CAGTGACGTT CGCCATCATA CCGCCAGCGC GACAGCCACC 240 

TGGTGCCAAC TCACTGAGCC GCCTG 265 



(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2438 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

AAGTGGCGGC GGTCCCTGGA GAGCAGGCGG AGGOAGCGGC AAGTCTGACT CTGGGCTGAC 60

CGTGOAGCCG GGGCGGGGGC TGACAGCCAG GCCTCCGCCT GGCGGGAGCC GCACGAGGAG 120

CGGGAGTGGC CGGGCCTCTC TTCCGCGCTT GAGCGAGCGC CGGGTGATGG CGGTGGTGAT 180

GGCGGCAGGC GCTCGGACAG CTCCGCTTGA GCTGAGCTCG GAGAGATCCG TCCAGAAAGT 240

GCCCAGAAGA AACTTCCTCT TAGAAAAGCT GAAAAACACA RTATTTATAA CACTGGAAAT 300

TGTAAAGAAT TTGTTTAAAA TGGCTGAAAA CAATAGTAAA AATGTAGATG TACGGCCTAA 360 

AACAAGTCC3G AGTCGAAGTG CTGACAGGAA GGATGGTTAT GTGTGGAGTG GAAAGAAGTT 420 

GTCTTGGTCC AAAAAGAGTG AGAGTTGTTC TGAATCTGAA GCCATAGGTA CTGTTGAGAA 480 

TGTTGAAATT CCTCTAAGAA GCCAAGAAAG GCAGCTTAGC TGTTCGTCCA TTGAGTTGGA 540 

CTTAOATCAT TCCTGTGGGC ATAGATTTTT AGGCCGATCC CTTAAACAGA AACTGCAAGA 600 

TGCGGTGGGG CAGTGTTTTC CAATAAAGAA TTGTAGTGGC CGACACTCTC CAGGGCTTCC 660

ATCTAAAAGA AAGATTCATA TCAGTGAACT CATGTTAGAT AAGTGCCCTT TCCCACCTCG 720 

CTCAGATTTA GCCTTTAGGT GGCATTTTAT TAAACGACAC ACTGTTCCTA TGAGTCCCAA 780 

CTCAGATGAA TGGGTGAGTG CAGACCTGTC TGAGAGGAAA CTGAGAGATG CTCAGCTGAA 840 

ACGAAGAAAC ACAGAAGATG ACATACCCTG TTTC'i'CACAT ACCAATGGCC AGCCTTGTGT 900 

CATAACTGCC AACAGTGCTT CGTGTACAGG TGGTCACATA ACTGGTTCTA TGATGAACTT 960

GGTCACAAAC AACAGCATAG AAGACAGTGA CATGGATTCA GAGGATGAAA TTATAACGCT 1020 

GTGCACAAGC TCCAGAAAAA GGAa.TAA.GCC CAGGTGGGAA ATGGAAGAGG AGATCCTGCA 1080 

GTTGGAGGCA CCTCCTAAGT TCCACACCCA GATCGACTAC GTCCACTGCC TTGTTCCAGA 1140 

CCTCCTTCAG ATCAGTAACA ATCCGTGCTA CTGGGGTGTC ATGGACAAAT ATGCAGCCGA 1200

AGCTCTGCTG GAAGGAAAGC CAGAGGGCAC CTT7TTACTT CGAGATTCAQ CGCAGGAAGA 1260

TTATTTATTC TCTGTTAGTT TTAGACGCTA CAGTCGTTC? CTTCATGCTA GAATTGAGCA 1320 

GTGGAATCAT AACTTTAOCT TTGATGCCCA TGATCCTTGT OTCTTCCATT CTCCTGATAT 1380

TACTGGGCTC CTGGAACACT ATAAGGACCC CAGTGCCTGT ATGTTCTTTG AGCCGCTCTT 1440

GTCCACTGCC TTAATCCGGA CGTTCCCCTT TTCCTTGCAG CATATTTGCA GAACGGTTAT 1500 

TTGTAATTGT ACGACTTACG ATGGCATCGA TGCCCTTCCC ATTCCTTCGC CTATGAAATT 1560 

GTATCTGAAG GAATACCATT ATAAATCAAA AGTTAGGTTA CTCAGGATTG ATGTGCCAGA 1620 

GCAGCAGTGA TGCGGAGAGG TTAGAATGTC GACCTGCATA CATAXTTTCA TTTAAT&TTT 1680

TATfTTTCTT ATGCCTCTTT QAA.TTCTTOT ACAAAGGCAG TTGAATCA&A TAAAACTGTG 1740 

CCCTAAGTTT TAATTCCAGA TCAATTTATT TTTTTTATGA T&CACTTGT? ATATATTTTT 1800 

AAGCASGTG? TTGGTTT7GT TTTTACCATA TAAATTTACA TATGGTCCAG GCATATTTAC 1860

AATTTCAAGG CATTGCATAT ACATTTGAAT ATTCTGTATT TTTTAAATAA TCTTTTGTTC 1920 

TTTCCTATGT GTGAAATATT TTGCTAATC? ATGCTATCAG TATTCTTGTA TGACCGAATA 1980 

GTTACCTATT CTCTTTTCAT CTTGAAGATT TTCAGTAAAG AGTGTTGTAA TCAATCCATT 2040 

ATAATGTAAT TGACTTTTGT AATTTGC C AA TAGGAGTGTT AAACAACAAA ATGATTTAAA 2100 

ATGAAAC TTA ATGTATTT'rC ATTTXAAATA TTAACTAAAC C AAGTTTGT 1 ? TGTTAGTTAT 2160 

TCTAGCCAAT AAGAAAAGAG AATGTAGCAT CCTAGAGGTG TATTTG'TTCT GCAGTTTGGC 2220 

AGGACCGTCA G'H'AGTCCAA ATAAACATCC CCTCAOCGTG GAGGCGAATG GAACCTGTGC 2280 

TCCTTTCTTA CGGGAAGCTT TGCAAAGCAA AATAGCAGGG TTACAAGCTT GGASTTGTTA 2340 

AGGCAACT AG AGTTTTCTCT ATTAATTTAT AGACTGTTGT TGCACCTACT TAGCTCTTTT 2400 

TTGGGAACTC TAGTTCCCAG GGGAAAATAC CTCGTGCC 2438 

(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 542 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

Ser Gly Gly Gly Pro Trp Arg Ala Gly Gly Gly Ser Gly Lys Ser Asp
1 5 10 15

Ser Gly Leu Thr Val Glu Pro Gly Arg Gly Leu Thr Ala Arg Pro Pro
20 25 30

Pro Gly Gly Ser Arg Thr Arg Ser Gly Ser Gly Arg Ala Ser Leu Pro
35 40 45

Arg Leu Ser Glu Arg Arg Val Met Ala Val Val Met Ala Ala Gly Ala
50 55 60

Arg Thr Ala Pro Leu Glu Leu Ser Ser Glu Arg Ser Val Gln Lys Val
65 70 75 80

Pro Arg Arg Asn Phe Leu Leu Glu Lys Leu Lys Asn Thr Xaa Phe Ile
85 90 95

Thr Leu Glu Ile Val Lys Asn Leu Phe Lys Met Ala Glu Asn Asn Ser
100 105 110

Lys Asn Val Asp Val Arg Pro Lys Thr Ser Arg Ser Arg Ser Ala Asp
115 120 125

Arg Lys Asp Gly Tyr Val Trp Ser Gly Lys Lys Leu Ser Trp Ser Lys
130 135 140

Lys Ser Glu Ser Cys Ser Glu Ser Glu Ala Ile Gly Thr Val Glu Asn
145 150 155 160

Val Glu Ile Pro Leu Arg Ser Gln Glu Arg Gln Leu Ser Cys Ser Ser
165 170 175

Ile Glu Leu Asp Leu Asp His Ser Cys Gly His Arg Phe Leu Gly Arg
180 185 190

Ser Leu Lys Gln Lys Leu Gln Asp Ala Val Gly Gln Cys Phe Pro Ile
195 200 205

Lys Asn Cys Ser Gly Arg His Ser Pro Gly Leu Pro Ser Lys Arg Lys
210 215 220

Ile His Ile Ser Glu Leu Met Leu Asp Lys Cys Pro Phe Pro Pro Arg
225 230 235 240

Ser Asp Leu Ala Phe Arg Trp His Phe Ile Lys Arg His Thr Val Pro
245 250 255

Met Ser Pro Asn Ser Asp Glu Trp Val Ser Ala Asp Leu Ser Glu Arg
260 265 270

Lys Leu Arg Asp Ala Gln Leu Lys Arg Arg Asn Thr Glu Asp Asp Ile
275 280 285

Pro Cys Phe Ser His Thr Asn Gly Gln Pro Cys Val Ile Thr Ala Asn
290 295 300

Ser Ala Ser Cys Thr Gly Gly His Ile Thr Gly Ser Met Met Asn Leu
305 310 315 320

Val Thr Asn Asn Ser Ile Glu Asp Ser Asp Met Asp Ser Glu Asp Glu
325 330 335

Ile Ile Thr Leu Cys Thr Ser Ser Arg Lys Arg Asn Lys Pro Arg Trp
340 345 350

Glu Met Glu Glu Glu Ile Leu Gln Leu Glu Ala Pro Pro Lys Phe His
355 360 365

Thr Gln Ile Asp Tyr Val His Cys Leu Val Pro Asp Leu Leu Gln Ile
370 375 380

Ser Asn Asn Pro Cys Tyr Trp Gly Val Met Asp Lys Tyr Ala Ala Glu
385 390 395 400

Ala Leu Leu Glu Gly Lys Pro Glu Gly Thr Phe Leu Leu Arg Asp Ser
405 410 415

Ala Gln Glu Asp Tyr Leu Phe Ser Val Ser Phe Arg Arg Tyr Ser Arg
420 425 430

Ser Leu His Ala Arg Ile Glu Gln Trp Asn His Asn Phe Ser Phe Asp
435 440 445

Ala His Asp Pro Cys Val Phe His Ser Pro Asp Ile Thr Gly Leu Leu
450 455 460

Glu His Tyr Lys Asp Pro Ser Ala Cys Met Phe Phe Glu Pro Leu Leu
465 470 475 480

Ser Thr Pro Leu Ile Arg Thr Phe Pro Phe Ser Leu Gln His Ile Cys
485 490 495

Arg Thr Val Ile Cys Asn Cys Thr Thr Tyr Asp Gly Ile Asp Ala Leu
500 505 510

Pro Ile Pro Ser Pro Met Lys Leu Tyr Leu Lys Glu Tyr His Tyr Lys
515 520 525

Ser Lys Val Arg Leu Leu Arg Ile Asp Val Pro Glu Gln Gln
530 535 540

(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4399 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

CCCTCTGGGC AAGCCGCCCC CCCCCCACCC ATCTACCACA CACACACACA CACACACACA 60

CACACATTCA GACCTTGGGG CAAAAACAAA GCAAAATAAC AACAACAAAA ACACTGCCTG 120

TGGAAAGTCC TTACTTCAGG AAGGTTGGCA GATGAGGAGC AAGGGAACAT TTTATCAGGA 180 

CTGCCACAAA GGAGTCTTTT TTTTTAATGG TTTTTCAAGA CAGGGTTTCT CTGTATAGCC 240 

CTGGCTGTCC TGGAC-CTCAC TTTGTAGACC AGGCTGGCCT CGAACTCAGA AATTCOCCTG 300 

CCTCTGCCTC CTGAGTGCTG GGATTAAAGG CGTGCAGCAC CA.TGTCCAAC TGGCATTTTC 360 

TCAATTAAGG TTCGTTCCTT TCAGATAACT CTAGGTTCTG GGTCAAGCTG AC ACAAGGC T 420 

ACACAGCACA GTTTGTATGC CACATTCAGT TCAGAAGACA CCCAACCTCC CTGGAACTGG 480 

AACT'TATGCA CATTTGTGAG CTTCCACTTG GGAGTGGGAA CCTGAACTGG GTCCTCTGCA 540 

AGAGCAGCCG TGCTCTTAAC TGCTGAGCCA TTTCAGCAGC CTCACATCAG AATTAAGTTA 600 

GAAATTAGCCG GGTATGAATC ATACCCTTAG AATCCTAGCA TCTGAAAGCA GAGCTAAGAG 660

AAACAGGGAT TCAAGACCAG CTCT'i'GGCTA CAGAGCCCGT CCTGTCC33M3 GATGGGCTAC 720

AAGAGACTAT TTCAAAGCCA TCCAAACAAC AATAACTACA ACAACAACAA GGTTAAAATT 780 

AGGCTGGGCA CAGGGTACAC ACCTTTAATG CCAACACTCA GGAGGCAGAG GCAGGCTGAT 840 

CAGTGTGAGT TTCAGTTCAA CGTGGTCTAC ATAGGGAGTT CTAGGCCAGC AGAGGTTACA 900 

GTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCACACA CACACACACA CACACACACA 960

CACACACACA CACACACGGT GGCATTATGG GATTTTTTTG GGATAAGGTT TCTCTGTCTA 1020 

GCCCTGGCAT AGATTCACTC TGTAGACTAG GCTAGCCTTG AACTCAGAGA TCCGCCTGCC 1080 

TCTGCCTCCC AAGTGCTGGG ATTATAGGTG TTGCACCACC ACTGCCCAGC C AC TTTGGG A 1140 

TTTTTGAACT GTTATCMGA. GGCTTTCGAG GAOGTCAAAC TTCAACAOCA ACCTCTCCAT 1200 

GATAATGTAG CTAATGATCA AACGACACTC AAAACTTAAC CCTTAAAGCA CACATCCACC 1260

AGACAGCGTG CCCACTCGTA GTTCCATTAC TCAGGAGGCT GAAGCAGGAG GATGAAGGAC 1320

TAAGGCTTCA GCAACCTAGG GAGCCGCAGG GGACAGTAGT CTCAATCCCT ACATTCTCCT 1380

GAACACAGGA GCAGGAGTTC AGGAAGGGTG TCAAGGCCGC TTACTGATCT TAGGGCCTCA 1440 

GGAATGACTA GCTCAGGCAG AGAGAGCAAA GGTCTCCAGT GGAGAAGTCT ACACACACAC 1500 

ACACACACAC ACACACACAC ACACACACAC AGAATCCAAG GCGATGACGT CATCAAAGGG 1560 

TTAATTCTAG TCTGGGATGG GGGGGAGGGT GGGGCACGCA GCTGTCAGGT GGCTTTGGAA 1620

AAATAAACTG CTGAAGAGTC TGACGCCAQG GAGTCCTGGG AGGGACAAGA GGTTACCCAC 1680

TCAAAGAGTG TGCTCCACAA AGCATGCGCG CTTGTCCACG TCTGGAGTCG TCACTPATTT 1740

TTTGCCTGGA TTCTTTGTAG CCGGTGGG'PT CTCAAGGCGG TAAGTGGTGT GGCCGCCGTG 1800 

GTCTGGGAGG TGACGATAGG GTTAATCGTC CACAGAGCCC AGGGGCGGAG CGCGGGCGGG 1860 

CGTCCGCAGC CCCGCTGGAG CCGGAAGCAG TGGCTGGTCA GGGGCGCTTC TAGCCTTCCC 1920 

TATCTGTAC? TCCACAGAGG TCTCTGCGAG CTAGGOOGAC AGTGAGGTGC GGGGTAGGGG 1980

CCCGGCGTTA GAGCCAGCA& GGGGACGCTIT CAGGGTAAGG TCTGAGGGAG AGAGAGCTCC 2040

TGAGAAACTT GGGGGGCGCG ACACAGATAG GGTGAAAGCA GAGTGATAGA CCTGGGATGG 2100 

TTAGGGGACC AAGGGAAGAC CAGGCTGGTT GGCATACACC SGTGAACGGA TGGGAGTCCT 2160 

AGGGAAAGA? GATGCGCCTA ACAGTCCTTT CTOTCTOCAC AGCACTCCAG GGGACGATCC 2220 

QGAGCTCAAC TTTCAAAAGC GAGACGCCCC AGCAAGCCTG TTTTGAGAAG TTCTTCAGCG 2280

GCTCTCCTCA TGGGCCAGAC QGCCCTGGCA AGGGGCAGCA GCAGCACCCC TACCTCGCAG 2340

GCTCTGTAC'X' CGGACTTCTC TCCTCCCGAG GGCTTGGAGG AGCTCCTGTC TGCTCCCCCT 2400 

CCTGACCTGG TTGCCCAACG GCACCACGGC TGGAACCCCA AGGATTGCTC CGAGAACATC 2460

GATGTCAAGG AAGGGGGTCT GTGCTTTGAG CGGCGCCCYG TGGCCCAGAG CACTGATGGA 2520 

GTCCGGGGGA AACGGGGCTA TTCGAGAGGT CTGCACGCCT GGGAOATCAG CTGGCCCCTG 2580 

GAGCAAAGGG GCACACACGC CGTGGTGGGC GTGGCCACCG CCCTCGCCCC GCTGCAGGCT 2640 

GACCACTATG CGGCGCTTTT GGGCAGCAAC AGCGAGTCCT GGGGCTGGGA TATTGGGCGG 2700 

GGAAAATTGT ATCATCAGAG TAAGGGCCTC GAGGCCCCCC AGTATCCAGC TGGACCTCAG 2760

GGTGAGCAGC TAGTGGTGCC AGAGAGACTG CTGGTGGT7C TGGACATGGA GGAGGGGACT 2820

CTTGGCTACT CTATTGGGGG CACGTACCTG GGACCAGCCT TCCGTGGACT GAAGGGGAGG 2880 

ACCCTCTATC CCTCTGTAAG TGCTGTTTGG GGCCAGTGCC AGGTCCGCAT CCGCTACATG 2940

GGCGAAAGAA GAGGTGAGAT ACGGACTAGG TGTGGGGAGA TCACTACTCT TGGCAATC3? 3000 

TTGGGCTGGA AACTCATGGT TGGAGCACAG GAAGTAGGCT TCTTGTCACT TTGGCCTGTC 3060

ACTTAGATGG CCTTGGATCT AGCTTCftCTC CCAATCCCTA TTGGATGTGA TGCACAAATT 3120

CAGAGCCTTT GGGTCTCCCT CAGCT6AGGT GGCGGTGGAA ATGGAGGAAG AAGGA&GGG? 3180

GCCTGAGCAG GATCTCAAGT TCAAGGATGC CTGGAGTTGC TTACTTACC'I' TGTCTTCCTI' 3240 

CTCTCTCCGC AGTGGAGGAA CCACAATCCC TTC?GCACCT GAGCCGCCTG TGTGTGCGCC 3300
ATGCTCTGGG C5GACACCCGG CTGGGTCAAA TATCCACTCT GCCTTTOCCC CCTOCCATGA 3360

AGCGCTATC? GCrCTACAAA TGACCCAGTA GTAC1U3GGTO TGCTGGCACC CTACCGTGGG 3420 

GACAGGTGGA GAGGCACCCG CTOGCC?A<3& CAACTTTAAA AAGCTGGTGA AGCTGGGGGG 3480 

GGGGGGCTGG ACCCCTTCAC CTCCCCTTCT CACAGGAGCA AGACATATAG AAATGATATT 3540 

AAACACCATG GCAGCCTGGG ACAAAGAGGT T2TJX&M3&K AAAAATGAGA TGTATTGTCA 3600 

CAACCTGTTT CATTATTGT? TTT^GTTTTG TTTTACACTG CCCCACCCCA GGCTAGAGCC 3660

C C A TC AC TOST CTTAAGGAAT TATGACAACC CACAAAGCTC AGGCCCAGG'l* GTTTATTTCC 3720 

CTTACATGTA GGATGGTTCA CAAACACAAT ACAGGGGCTT TGGCACCGTG GGGGAGGGGA 3780 

CTATCCCAGG CCTCTTAGGG TCTCATGTAT ACCGAATTCA GACCCGAAAG CTCTGAATTT 3840 

CTGCATCAGA CATCCAGTAG AACTTGGGAG TGAAGCTAGA GCCAAGGCCA TCTAAGTGAC 3900

AGGCCAAAGT GACACGAAGC CCACTTCCTG TGCTCCAACC ATGAGTTTCC AGCCCAAACC 3960

AATGGAAGGT GATTTCACTT GTCAGGGCCC AAAGGGACAG TCAGTTCTAC TCCCTCCCCT 4020

CACTAGGAGC CACCTTGGTG ACAGTTGATT CTACCCACTG TAAGTGGTAA AGGGATTGGC 4080 

CTOOTCCCAA CCATAATAGG GCGGTGGAAA CGGCTCAGGA GGGTACAGCG TGGATTAGGC 4140 

CACAAGATGG 6GCAGATGAT GTCATCAGAA GCATGTGACC GGTGGGAGCA GTTACTAAAC 4200

TTCTGGGCAA CCTAGTCCAT GCTATGCAGG CAGGTAGAGG GATGGGCAGT GCTCATTGTT 4260 

TGOCATTGAT GATGTCCACA AATTCAGGCT TGAGAGATGC GCCACCCACA AGGAAGCCGT 4320 

CCACGTCAGG CTGGCTTGCC AGCTCTTTOC AGGTTGCTCC AGTCACAGAA CCTGTACCAG 4380

GAACAAGAAG ACAGTTTGGT CAGGTCTATG ATCAGAACAC TTAAGCCCCA CCTCTCTGTG 4440 

CAAGGCAGCC TCAGTCTGTC TTAGCCCATT TCCGTCTTAG OTAGAGCCAA AGCCACTCAC 4500 

CTCCATAAAT GATCCGGGTG CTCTGAGCCA CCCCATCATT GACATTGGA? TTCAGCCATC 4560 

CCCGGAGCTT CTCGTGTACT TCCTGTGCCT AGAAGGAGGA GGCAGAGCTA CTAAGTAAGC 4620 

TCCUPCCE&T CTATCATTCA &OSACREAAAA ACCACTGGTT OTCACATAGA GTTGAGTTTC 4680 
CAGAAAAGCC CCGGGACCAG AGAGTGGCAA OOCTCCJVATC CCACCAGGCT TGGAATGAAC 4740

ATTTTTGGCA AAGTCACTCT CCTTGGTGAG TTTGGGGGCC CTCTOTCTCT AAAGGGGCT? 4800 

GGATGGGCTC CATAGCTGTG TGAGTCTGTT AAAGCCGGAC AGGCTGAGGA SCTCTGGGTA 4860

GTTACC TGC'f GAGGGGTTGC CGTCTTGCCA GTCCCAATGG CCCACACAGG TTCATAGGCC 4920

AGGACCACCT TGCTCCAG'i'C TTTCACATTA TCTGTOGGGC AGAGAGGAGA GTGAGTAGGA 4980 

AGGAGCTGAC CCGCCAAGC 4999

(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 264 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

Met Gly Gln Thr Ala Leu Ala Arg Gly Ser Ser Ser Thr Pro Thr Ser
1 10 15




Ser Ala Pro Pro Pro Asp Leu Val Ala Gln Arg His His Gly Trp
35 40 45 



Pro Lys Asp Cys Ser Glu Asn Ile Asp Val Lys Glu Gly Gly Leu
50 55 60 



Cys 
65 





Glu Gln Arg Gly Thr His Ala Val Val Gly Val Ala Thr Ala Leu
100 105 110

Ala Pro Leu Gln Ala Asp His Tyr Ala Ala Leu Leu Gly Ser Asn Ser
115 120 125 

Glu Ser Trp Gly Trp Asp Ile Gly Arg Gly Lys Leu Tyr His Gln Ser
130 135 140 

Lys Gly Leu Glu Ala Pro Gln Tyr Pro Ala Gly Pro Gln Gly Glu Gln
145 150 155 160 

Leu Val Val Pro Glu Arg Leu Leu Val Val Leu Asp Met Glu Glu Gly
165 170 175 

Thr Leu Gly Tyr Ser Ile Gly Gly Thr Tyr Leu Gly Pro Ala Phe Arg
180 185 190

Gly Leu Lys Gly Arg Thr Leu Tyr Pro Ser Val Ser Ala Val Trp Gly 
195 200 205

Gln Cys Gln Val Arg Ile Arg Tyr Met Gly Glu Arg Arg Val Glu Glu
210 215 220 

Pro Gln Ser Leu Leu His Leu Ser Arg Leu Cys Val Arg His Ala Leu
225 230 235 240 

Gly Asp Thr Arg Leu Gly Gln Ile Ser Thr Leu Pro Leu Pro Pro Ala
245 250 255 

Met Lys Arg Tyr Leu Leu Tyr Lys 
260 



(2) INFORMATION FOR SEQ ID NO:47: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5615 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

GTACTTTCTT TATATCTCCA TAATTTTAT? TACTATTACT ACATGATACA TTATTTTATA 60 

AAAGTCTTTG TAACCTCCTT AAGGATTCAC TGCTTAATC? CCAGTGCTTA GCACAAATCA 120

TTAAATGCGA ACCAGAAACT CTTCCAAATG TGTTACATCT ATAACCTCAT TGGATTCTCA 180 

CTACCAACCC CATGCAATAG ATACTAATGT GATCTCTGTC TTACAGAGGA AGAAACAGGC 240 

ACAGGGAGGT 1'CAG'TAATTT GCCCAAGGTC ATACACACAC TGGCCSTCAG GTATTCATSC 300

CCGGGGAGTC TGGTCCCACA GCTGGCATGT TTGCCATTA? ATTATATTGC CTCCTTATAG 360 

TCTCGGCACT CATTAAGCAC ATTGACAGCT ATGCTTGGTS AGTGACTACT ATGTACCCAG 420

CTCTGTGCTA CATGCTTTAC CTGGATTATT TCAACTGC&C AACAACCCTG TGAGGTAACT 480 

ACCATCATTG CTCCTATTTT ACATAACAGA AAACTACAGA AAXCTCGGGC TGGGCGTAGT 540 

GGCTCATGCC TGAAATCCCA GCACT'i'TGGG AGACCCTGTC TCTAAAAAAA ATTTTTTTTT 600 

GGCCGGACGT GGTGGCTCAC ACCTGTAATC TCAGCACTTT GGGAGGCTAA GGCAGGCAGA 660 

TCACAAGGTC AGGAGTTCTA GACCAGCCTG GCCAACATGG CAAAACCCTG TGTCTACTAA 720 

AAATACAAAA AATAGCTAGG CGTGGTGGCA GGTGCCTGTA ATCCCAGCTA CTCAGGAGGC 780 

TGAGGCAGGA GAATCCCCTG AACCTGGGAG ATGGAGGTTA CAOAGAGCCG AGATCGTGCC 840 

GCTGCACTCC AGCCTGGGCA ACAAGAGCAA GACTCTGTCT CGAAAAAAAT AAAAATAAAA 900 

ATAAAAATAT TTTTTTAAAA ATTAGCTGGG TGTGGTAGCA CATGCCTGTA GTCCCAGCTA 960 

CTTGGGAGGC TGAGGTAGGA GGATCACTTG AGCCCAGGAG GTCAAGGCTG CAGTGGGCT3 1020 

TGATOGCGCC ACTGCACTCT AGCCTTGGTG ACAGCAAGAC CCTGTCTCAA AAAAAAAAAA 1080 

AAGAGAAATC GGGCAACTTC CCCAAGATCG CGCAGTTAAC TAGTGGCATA GCTTCACTCA 1140 

AACTCGAAGT CTTAATCAGG ACACTCTACC AAATGAGATC AACGGCTCAG TAATGGATTG 1200

GCATCCAGTA TGAAGACTGG ACCA6CAGGG AGAACTATGA 2GCSTACAGC CTAGAQCCTG 1260

AAGCAGATTT CACAGCCTCA GAOGTGGCAC AGGCTGACTC ACAACCCGGG GCAGAAAGGG 1320 
ACCAGCCCAG AAACAGTOAC CCAGAATCAC AGGGAAGTAG AAATGGGATT CGGCACAATG 1380

AAGCCCCTCC TTGACCCGAT GCTCCTTACC CTCAGGGGCG CAGGAGTTAG TCGCTCAGGC 1440 

GGCTCAAAGG TCTTGACGGT GGAGAACACC ATCCCCAGGG ATTCCCGACG CGGTGATGCC 1500 

ATCAAAGCGT TAATTCTGAG ATGGGCCTGC CCGGGTGCGG ACTCTGCCGC SGCAAGAGAA 1560 

GGGTTAACTG CCCCGGGCCT TCGCCGTQG8 GGCGGGGCCT CGGGGAGGGT CACAGCCCGG 1620

GACTOAGACC CGAGGTTAAC CGCCCGGGGT GGGCTCCACG GGGGCGGGGC ATGCTCTCCG 1680 

CGGCTGCTGC CGGTATAGAG CGGTAACTGC CCAGGAGGGG GCCK3QGCCCC ACAGGGGCGT 1740 

GGCCTCGGAG C7GCACGGCC GTGGGCGGCG ATGAGAGGGT TAAGCCCCAG AGGGCCCTGG 1800 

AGGGGCGGGG CCGCGGGACG GGCTCGGCCC AAGGGAGGAG CTGGGGGCGG AAGCGGCCGG 1860

CGGTCTGCGC CCTGCGCGCC TCGGCTi'CTT TCGGCCCGGC TCCTTCAGAG GCCCGGCGAC 1920 

CTCCAGGGCT GGGAAGTCAA CCGAGGTTCG GGGGCAGCGG CGAGGGCTCC GGGCGAGTAA 1980 

GGGGGATGGT CCATGCTGAG GCCCAAATGG GGCGAACTCG CGAGAGTCTC TGGCGACCTG 2040 

GATCAGATGG GGCGAGGGCA GATGAAGGGC CCAGGAGCTT TGGGGCAGCG AGGAGGGAGG 2100 

AGCGGGCCCG TTGGCAAACT TGGGTGAAAG GATGGGGTAC CTGGG'i'G AC G AGCCCCCGCC 2160

AGGATTCTGC TCTTCACGCC CCTTTTCTCC CAGCTCCCTT CCAGGTCAAT CCAAACTGGA 2220

GCTCAACTTT CAGAAGAQAA AGACGCCCCA GCAAGCCTCT TTCGGGGAGT CCTCTAGCTC 2280

CTCACCTCCA TGGGCCAGAC AGCTCTGGCA GGGGGCAGCA GCAGCACCCC CACGCCACAG 2340 

GCCCTGTACC CTGACCTC'i'C CTGTCCCGAG GGCTTGGAAG AGCTGCTGTC TGCACCCCCT 2400 

CCTGACCSGG GGGCCCAGCG GCGCCACGGT TGGAACCCCA AAGACTGTTC AGAGAACATC 2460 

GAGGTCAAGG AAGGAGGGTT GTACTTTGAG CGGCGGCCCG TGGCCCAGAG CACTGATGGG 2520 

GCCCGGGGTA AGAGGGGCTA TTCAAGGGGC CTGCACGCCT GGGAGATCAG CTGGCCCCTA 2580 

GAGCAGAGGG GCACGCATGC CGTGGTGGGC GTGGCCACGG CCCTCGCCCC GCTGCAGACT 2640 

GACCACTACG CGGCGCTGCT GGOCAGCAAC AGCGAGTGGT GGGGCTGGGA CATCGGGCGG 2700 
GGGAAGCTGT ACCATCAGAG CAAGGGOCCC GGAGCCCCCC AGTftTCCAGC GGGAACTCAG 2760

GGT3AGCAGC TGGAGGTGCC AGAGAGACTG CTGGTGGTTC TSGAOWOGA CKSAGGGRftCT 2820

CTGGGCTACG CTATTGGGGG CACCTACCTG GQGCCAGCAT TCCGCGGACT GAAGGGCAGG 2880 

ACCCTCTATC CGGCAGTAAG CGCTGTCTGG GGCCAGTGCC AGGTCCGCAT CCOCTACCTO 2940

GGCGAAAGGA GAGGTGAGCC CTGGGGCAGA CGTGGGGAGA ACTTTCTGTC CCTGGTGGCA 3000 

GI'GGTTTGGG ATGGAAACTC TTCTGACAAG AGCACAGGGG ATGGACCTTC ATCCAGCCTG 3060 

CCTCAACCTC TGTTCAGTGC TGGGAAAGGC TAGGGGTCTT CACAGCTOTT ATTTAATTTA 3120 

ACCCAACAGC AATAGAGGTG AAACAGGCTT GAGAAAGCAA CTTTCTCAAG TTCTCTTGGC 3180 

CAGTAAATGG TGAACCTTCA GAATGGAGGG AGGAACTGCA GGGATGAGAG AATTCAGGAG 3240 

ATATCAACCC CTGAGCAAGA GGTGCAAAGC GTTAGGTACT GGGTTTCATG TACAGGTCCA 3300 

AAAGAAGGAT GGGCAGAGCC AGGTACCCAG GCTG7ATACC GGATTCCCTG GGC7CTAACC 3360 

TGTCTCTGTG CCACATACCT ACTTCCTTCC TCAGCCACAC CTCTGGATGG AGACACTGGG 3420 

QCCCTGGGCA CCAGGGAGGA GAGCAGTGGA GGAGGCAGGG CCTTAGGGTG GGGCAGCAGG 3480 

GGAGGAGCCT CCCCAGSAAC T6ACT0GGTC CAGGGCTTGG AGCTGCTCTC TGCAGTTGTG 3540 

TGGGCTGTAG AGTGGAGGGC CATCCCTCCT CACCTCAGCC CCAGCTCCCA AGCCTCTGGA 3600 

GTCAAAGCC? GGGCCAGCTC CACCACTOTC AGAGCCACCT TGGCCTGTTG TTTAGAGGGC 3660 

CTTAGCCAGC TCTTCACCCC CAGCTCTGAC TAGGGATGTG TGAAATCTTA TCTGGGAGGC 3720 

AGAACTTCCG GGTATCTCAA ATTCCCCTTT CAGCCAGGTG GGCACACTCG AAGCAGGAAA 3780 

GCAGAAAGGC ATCTGAGTAG GACCCCGTAG TTTGAGGACA TCTGGCTGGT GGCTGCACCC 3840

ATACTTAGAT TCCCCTCCTT CTCTCTCCCA GCGGAGCCAC ACTCCCTTCT GCACCW3AGC 3900

CGCCTGTGTG TGCGCCACAA CCTGGGQGAT ACCCGGCTCG GCCAGGTGTC TGCCCTGCCC 3960 

TTGCCCCCTG CCATGAAGCG CTACCT<3CTC TACOAGTGAG CCCTGTGATA CCACAGACTG 4020 

TGCTGAGGTC TTGCCACCAC CCCTCCCCTT GGGGAGGTGG GGAGGCACTG OTGGCCTAGA 4080 
CCAGCTGCTG AAAGCTGGTG AGGCTGAGCC CCTACCCCAA CCCAAGCTCT GCGGAAATCA 4140

ACAGCCCCAG AGCCACTTGG AGGGAGGAAG AAAGGGAGCC 6GCGTTCAAG GCTATGACAG 4200 

TCTGCTACGC AAAACATTTT TXCAAGTAAA AATAGTAAGA GATGTTGTTA TAGAAACCfG 4260 

TTCTTGTTTT TTT7TTTTTC TTGCACAAAT GATCATTTAT ATAGCTGCCT CAAAAAGGAA 4320

GATTATCTGG GCAAGTCCAG TGAAGGCAGA CAAACCACAA GACCTAGTGC CAGGTTTATT 4380 

CCCTCACATG GGTGGT7CAC ATACACAGCA CAGAGGCACG GGCACCATGG GAGAGGGCAG 4440 

CACTCCW3CC TTCTGAGGGG ATCTTGGCCT CACOGTGTAA 6AAGGGAGAG GATGGTTTCT 4500

CTTCTGGCCT CACTAGGGCC TAGGGAACCC AGGAGCAAAT CCCACCACGC CTTCCATCTC 4560

TCAGCCAAGG AGAAGCCACC 'i'TGGTGACGT Tl'AGTTCCAA CC&TTATAGf AAGTGGAGAA 4620 

GGGATTGGCC TGGTCCCAAC CATTACAGGG TGAAGATATA AACAGTAAAG GAAGATACAG 4680

TTTGGATGAG GCCACAGGAA GGAGCAGATG ACACCATCAG AAGCATATGC AGGGAAAGGG 4740 

CAGTTACTGG GCTTCTGGGC TGCTTAGTCC CTGGCTTGGG AGGAAGGGTA GGGAAGATGG 4800 

ATGGGGCTCA TTGTTTGGCA TTGATGATG? CCACG&ATTC GGGCTTGAGG GAAGCACCAC 4860 

CCACAAOGAA GCCATCCACA TCAGGCTGGC TGGCCAGCTC CTTGCAGGTT GCCCCAGTCA 4920

CAGAGCCTGG GAAGGGAGCA GAACAAGGGC TTGGTCAAGA ATGGGATGAG TCTGCCCCAT 4980

CCCCACCTCC A7GTCCGAGG GCTCAGTCTA GTCCTCAGCC CACTCCAGCT CAGCCGGGAA 5040 

CCAAAGCCAC TCACCTCCAT AAATGATACG GGTGCTCTGA GCCACCGCAT CAGAGACGT? 5100 

GGACTTCAGC CATCCTCGGA GCTTCTCGTG TACTTCCTGG GCCTAGAACA AGAAGCTGGC 5160

CTAAGTAAGA CCTTTTCTGC CTCTCTAAGA GGAAAA&TCA CTGGCACCAG TGGACACTTA 5220

GTGTGGTTTC TGACTGAGTC AGAGTACCAG GGCTCTGATC CAAGCCAGGC CCTGGACTOG 5280

ATGCCCTTGG ACAAGTCACT GTCTCTGGGT TCAAGGTCTC 1GTGTCTTTG AAATAAGGGG 5340 

TTGCCCCATG TGGGCTGTGT CTGTCCAAAC CTATTGAGGC AGGCTGGGAT GAGGGCAGGG 5400

CTCCTGGGCC CGGTTACCTG TTGGGGTGTT GCAGI'CTTGC CAGTACCAAT GGCCCACACA 5460 
GGCTCATAGG CCAGGACGAC CTTGCTCCSS TCCTTCACGT TATCTGC&GG GCAGAGATAC 5520

AGATGGAGGG AAGGGTGAAC AAGAAAGAGC TCTCCAGCCA GOTTCTCCGG AGTACGAAGA 5580 

ACGGTGGCCT ACTGCCCCCT AGTGGACATT GGGGG 5615 

(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 263 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:

Met Gly Gln Thr Ala Leu Ala Gly Gly Ser Ser Ser Thr Pro Thr Pro
1 5 10 15

Gln Ala Leu Tyr Pro Asp Leu Ser Cys Pro Glu Gly Leu Glu Glu Leu
20 25 30 

Leu Ser Ala Pro Pro Pro Asp Leu Gly Ala Gln Arg Arg His Gly Trp
35 40 45 

Asn Pro Lys Asp Cys Ser Glu Asn Ile Glu Val Lys Glu Gly Gly Leu
50 55 60

Tyr Phe Glu Arg Arg Pro Val Ala Gln Ser Thr Asp Gly Ala Arg Gly
65 70 75 80 

Lys Arg Gly Tyr Ser Arg Gly Leu His Ala Trp Glu Ile Ser Trp Pro
85 90 95

Leu Glu Gln Arg Gly Thr His Ala Val Val Gly Val Ala Thr Ala Leu
100 105 110 

Ala Pro Leu Gln Thr Asp His Tyr Ala Ala Leu Leu Gly Ser Asn Ser
115 120 125
Glu Ser Trp Gly Trp Asp Ile Gly Arg Gly Lys Leu Tyr His Gln Ser
130 135 140 

Lys Gly Pro Gly Ala Pro Gln Tyr Pro Ala Gly Thr Gln Gly Glu Gln
145 150 155 160 

Leu Gly Val Pro Glu Arg Leu Leu Val Val Leu Asp Met Glu Gly Gly
165 170 175 

Thr Leu Gly Tyr Ala Ile Gly Gly Thr Tyr Leu Gly Pro Ala Phe Arg
180 185 190 

Gly Leu Lys Gly Arg Thr Leu Tyr Pro Ala Val Ser Ala Val Trp Gly
195 200 205 

Gln Cys Gln Val Arg Ile Arg Tyr Leu Gly Glu Arg Arg Ala Glu Pro
210 215 220 

His Ser Leu Leu His Leu Ser Arg Leu Cys Val Arg His Asn Leu Gly
225 230 235 240 

Asp Thr Arg Leu Gly Gln Val Ser Ala Leu Pro Leu Pro Pro Ala Met
245 250 255 

Lys Arg Tyr Leu Leu Tyr Gln
260 

(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
AGCTAGATCT GGACCCTACA A1X3GCAGC

(2) INFORMATION FOR SEQ ID NO:50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
AGCTAGATCT GCCATCCTAC TCGMSOGGCC ASCTGG 36

CLAIMS:

1. A nucleic acid molecule comprising a sequence of nucleotides encoding or complementary
to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or a
nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C
wherein said protein comprises a SOCS box in its C-terminal region.

2. A nucleic acid molecule according to claim 1 wherein the protein further comprises a
protein:molecule interacting region.

3. A nucleic acid molecule according to claim 2 wherein the protein:molecule interacting
region is located in a region N-terminal of the SOCS box.

4. A nucleic acid molecule according to claim 2 or 3 wherein the protein:molecule
interacting region is a protein:DNA binding region or a protein:protein binding region.

5. A nucleic acid molecule according to claim 4 wherein the protein:molecule interacting
region is one or more of an SH2 domain, WD-40 repeats or ankyrin repeats.

6. A nucleic acid molecule according to any one of claims 1-5 wherein the SOCS box 
comprises the amino acid sequence: 

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20
X21 X22 X23 [Xj]n X24 X25 X26 X27 X28

wherein: X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xi may comprise the same or different amino
acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xj may comprise the same or different amino
acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P.
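
The consensus recited in claim 6 can be read as a pattern over one-letter amino acid codes. The following is a minimal sketch only, not part of the claims: it assumes that "any amino acid" means one of the 20 standard residues, and the names ANY, HYD, gap, SOCS_BOX_PATTERN, find_socs_box and SYNTHETIC are hypothetical, introduced purely for illustration.

```python
import re

# Assumption: "any amino acid" is taken to mean the 20 standard one-letter codes.
ANY = "[ACDEFGHIKLMNPQRSTVWY]"
# The recurring class "L, I, V, M, A or P" from the claim text.
HYD = "[LIVMAP]"


def gap(lo=1, hi=50):
    """Variable-length run [Xi]n / [Xj]n: n residues, with n from 1 to 50."""
    return f"{ANY}{{{lo},{hi}}}"


SOCS_BOX_PATTERN = re.compile("".join([
    # X1 .. X16
    HYD, ANY, "[PTS]", HYD, ANY, ANY, "[LIVMAFYW]", "[CTS]", "[RKH]",
    ANY, ANY, HYD, ANY, ANY, ANY, "[LIVMAPGCTS]",
    gap(),                       # [Xi]n
    # X17 .. X23
    HYD, ANY, ANY, HYD, "P", "[LIVMAPG]", "[PN]",
    gap(),                       # [Xj]n
    # X24 .. X28
    HYD, ANY, ANY, "[YF]", HYD,
]))


def find_socs_box(protein):
    """Return the first regex match of the consensus in `protein`, or None."""
    return SOCS_BOX_PATTERN.search(protein.upper())


# Synthetic test string built to satisfy the pattern; it is NOT a real protein.
SYNTHETIC = "MK" + "LAPLAAFCRAALAAAG" + "A" + "LAALPGP" + "A" + "LAAYL"

if __name__ == "__main__":
    hit = find_socs_box(SYNTHETIC)
    print(hit.span() if hit else "no consensus match")
```

A match only indicates that a sequence fits the recited residue classes; it says nothing about whether the protein modulates signal transduction.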

7. A nucleic acid molecule according to claim 6 wherein the protein modulates signal 
transduction. 
8. A nucleic acid molecule according to claim 7 wherein the signal transduction is modulated
by a cytokine or a hormone, a microbe or a microbial product, a parasite, an antigen or other
effector molecule.

9. A nucleic acid molecule according to claim 8 wherein the protein modulates cytokine-
mediated signal transduction.

10. A nucleic acid molecule according to claim 9 wherein the signal transduction is mediated
by one or more of the cytokines EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13,
IL-6, LIF, IL-12, IFNγ, TNFα, IL-1 and/or M-CSF.

11. A nucleic acid molecule according to claim 10 wherein the signal transduction is mediated
by one or more of IL-6, LIF, OSM, IFN-γ and/or thrombopoietin.

12. A nucleic acid molecule according to claim 11 wherein the signal transduction is mediated
by IL-6.

13. A nucleic acid molecule according to claim 1 wherein the nucleotide sequence encodes
an amino acid sequence substantially as set forth in SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO.
8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 14, SEQ ID NO. 18, SEQ ID NO. 21, SEQ
ID NO. 25, SEQ ID NO. 29, SEQ ID NO. 36, SEQ ID NO. 41, SEQ ID NO. 44, SEQ ID NO.
46 or SEQ ID NO. 48 or an amino acid sequence having at least about 15% similarity to all or
part of the listed sequences or a nucleotide sequence which hybridizes to the nucleic acid
molecule under low stringency conditions at 42°C.

14. A nucleic acid molecule according to claim 1 wherein the nucleotide sequence is
substantially as set forth in SEQ ID NO. 3, SEQ ID NO. 5, SEQ ID NO. 7, SEQ ID NO. 9, SEQ
ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17, SEQ ID NO.
20, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 26, SEQ ID NO. 27, SEQ
ID NO. 28, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32, SEQ ID NO. 33, SEQ ID NO.
34, SEQ ID NO. 35, SEQ ID NO. 37, SEQ ID NO. 38, SEQ ID NO. 39, SEQ ID NO. 40, SEQ
ID NO. 42, SEQ ID NO. 43, SEQ ID NO. 45 or SEQ ID NO. 47 or a nucleotide sequence
having at least 15% similarity to all or a part of the listed sequences or a nucleotide sequence
capable of hybridizing to the listed sequences under low stringency conditions at 42°C.
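
Claims 13 and 14 recite a threshold of at least 15% similarity to all or part of a listed sequence. The sketch below is an illustration only and makes several assumptions: it uses plain percent identity over two sequences that have already been aligned (gaps written as "-") as a simple stand-in for "similarity", and the function names percent_identity and meets_threshold are hypothetical. How similarity is actually measured is governed by the specification, not by this example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes the inputs are pre-aligned and of equal length, with '-' for gaps.
    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=15.0):
    """True when percent identity is at least `threshold` (15% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy strings, not taken from the sequence listing.
    print(percent_identity("ACGT", "ACGA"))
    print(meets_threshold("ACGT", "TTTT"))
```

Restricting the comparison to "part of" a sequence would correspond to calling the function on an aligned sub-region rather than on the full-length alignment.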

15. A nucleic acid molecule comprising a sequence of nucleotides encoding or complementary
to a sequence encoding a protein or a derivative, homologue, analogue or mimetic thereof or a
nucleotide sequence capable of hybridizing thereto under low stringency conditions at 42°C
wherein said protein exhibits the following characteristics:

(i) comprises a SOCS box in its C-terminal region wherein said SOCS box comprises 
the amino acid sequence: 

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20
X21 X22 X23 [Xj]n X24 X25 X26 X27 X28

wherein: X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xi may comprise the same or different amino
acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xj may comprise the same or different amino
acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F;
X28 is L, I, V, M, A or P; and

(ii) comprises at least one of an SH2 domain, WD-40 repeats and/or ankyrin repeats
or other protein:molecule interacting domain in a region N-terminal of the SOCS box;
and

(iii) modulates signal transduction.

16. An isolated protein or a derivative, homologue or mimetic thereof comprising a SOCS
box in its C-terminal region.

17. An isolated protein according to claim 16 wherein the protein further comprises a
protein:molecule interacting region.

18. An isolated protein according to claim 17 wherein the protein:molecule interacting region
is located in a region N-terminal of the SOCS box.



19. An isolated protein according to claim 16 or 17 wherein the protein:molecule interacting
region is a protein:DNA binding region or a protein:protein binding region.

20. An isolated protein according to claim 19 wherein the protein:molecule interacting region
is one or more of an SH2 domain, WD-40 repeats or ankyrin repeats.

21. An isolated protein according to any one of claims 16-20 wherein the SOCS box
comprises the amino acid sequence:

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20
X21 X22 X23 [Xj]n X24 X25 X26 X27 X28

wherein: X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xi may comprise the same or different amino
acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xj may comprise the same or different amino
acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P.

22. An isolated protein according to claim 21 wherein the protein modulates signal 
transduction. 

23. An isolated protein according to claim 22 wherein the signal transduction is modulated 
by a cytokine or other endogenous molecule, a hormone, a microbe or a microbial product, a 
parasite, an antigen or other effector molecule. 

24. An isolated protein according to claim 23 wherein the protein modulates cytokine-
mediated signal transduction.

25. An isolated protein according to claim 24 wherein the signal transduction is mediated
by one or more of the cytokines EPO, TPO, G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13,
IL-6, LIF, IL-12, IFNγ, TNFα, IL-1 and/or M-CSF.

26. An isolated protein according to claim 25 wherein the signal transduction is mediated by
one or more of IL-6, LIF, OSM, IFN-γ and/or thrombopoietin.

27. An isolated protein according to claim 26 wherein the signal transduction is mediated by
IL-6.

28. An isolated protein according to claim 16 wherein said protein comprises an amino acid
sequence substantially as set forth in SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID
NO. 10, SEQ ID NO. 12, SEQ ID NO. 14, SEQ ID NO. 18, SEQ ID NO. 21, SEQ ID NO. 25,
SEQ ID NO. 29, SEQ ID NO. 36, SEQ ID NO. 41, SEQ ID NO. 44, SEQ ID NO. 46 or SEQ
ID NO. 48 or an amino acid sequence having at least about 15% similarity to all or part of the
listed sequences.

29. An isolated protein according to claim 16 wherein the said protein is encoded by a 
nucleotide sequence substantially as set forth in SEQ ID NO. 3, SEQ ID NO. 5, SEQ ID NO. 7, 
SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID 
NO. 17, SEQ ID NO. 20, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 26, 
SEQ ID NO. 27, SEQ ID NO. 28, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32, SEQ ID 
NO. 33, SEQ ID NO. 34, SEQ ID NO. 35, SEQ ID NO. 37, SEQ ID NO. 38, SEQ ID NO. 39, 
SEQ ID NO. 40, SEQ ID NO. 42, SEQ ID NO. 43, SEQ ID NO. 45 or SEQ ID NO. 47 or a 
nucleotide sequence having at least 15% similarity to all or a part of the listed sequences or a 
nucleotide sequence capable of hybridizing to the listed sequences under low stringency 
conditions at 42 °C. 

30. An isolated protein or a derivative, homologue, analogue or mimetic thereof having the 
following characteristics: 

(i) comprises a SOCS box in its C-terminal region wherein said SOCS box comprises
the amino acid sequence:

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20
X21 X22 X23 [Xj]n X24 X25 X26 X27 X28

wherein: X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xi may comprise the same or different amino
acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;
X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xj may comprise the same or different amino
acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F;
X28 is L, I, V, M, A or P; and

(ii) comprises at least one of an SH2 domain, WD-40 repeats and/or ankyrin repeats
or other protein:molecule interacting domain in a region N-terminal of the SOCS box;
and

(iii) modulates signal transduction. 

31. A method of modulating levels of a SOCS protein in a cell, said method comprising
contacting a cell containing a SOCS gene with an effective amount of a modulator of SOCS gene
expression or SOCS protein activity for a time and under conditions sufficient to modulate levels
of said SOCS protein.

32. A method of modulating signal transduction in a cell containing a SOCS gene comprising 
contacting said cell with an effective amount of a modulator of SOCS gene expression or SOCS 
protein activity for a time sufficient to modulate signal transduction. 

33. A method of influencing interaction between cells wherein at least one cell carries a SOCS
gene, said method comprising contacting the cell carrying the SOCS gene with an effective
amount of a modulator of SOCS gene expression or SOCS protein activity for a time sufficient
to modulate signal transduction.

34. A method according to any one of claims 31-33 wherein signal transduction is mediated
by a cytokine, a hormone, a microbe or a microbial product, a parasite, an antigen or other
effector molecule.

35. A method according to claim 34 wherein the cytokine is one or more of EPO, TPO,
G-CSF, GM-CSF, IL-3, IL-2, IL-4, IL-7, IL-13, IL-6, LIF, IL-12, IFNγ, TNFα, IL-1 and/or
M-CSF.

36. A method according to claim 35 wherein the cytokine is one or more of IL-6, LIF, OSM,
IFN-γ and/or thrombopoietin.

37. A method according to claim 36 wherein the cytokine is IL-6.

38. A method according to any one of claims 31-37 wherein the SOCS gene encodes a
protein having a SOCS box comprising the amino acid sequence:

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 [Xi]n X17 X18 X19 X20
X21 X22 X23 [Xj]n X24 X25 X26 X27 X28

wherein: X1 is L, I, V, M, A or P;
X2 is any amino acid residue;
X3 is P, T or S;
X4 is L, I, V, M, A or P;
X5 is any amino acid;
X6 is any amino acid;
X7 is L, I, V, M, A, F, Y or W;
X8 is C, T or S;
X9 is R, K or H;
X10 is any amino acid;
X11 is any amino acid;
X12 is L, I, V, M, A or P;
X13 is any amino acid;
X14 is any amino acid;
X15 is any amino acid;
X16 is L, I, V, M, A, P, G, C, T or S;
[Xi]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xi may comprise the same or different amino
acids selected from any amino acid residue;
X17 is L, I, V, M, A or P;
X18 is any amino acid;


X19 is any amino acid;
X20 is L, I, V, M, A or P;
X21 is P;
X22 is L, I, V, M, A, P or G;
X23 is P or N;
[Xj]n is a sequence of n amino acids wherein n is from 1 to 50 amino acids
and wherein the sequence Xj may comprise the same or different amino
acids selected from any amino acid residue;
X24 is L, I, V, M, A or P;
X25 is any amino acid;
X26 is any amino acid;
X27 is Y or F; and
X28 is L, I, V, M, A or P.

39. A method according to claim 38 wherein the SOCS gene comprises a nucleotide
sequence selected from SEQ ID NO. 3, SEQ ID NO. 5, SEQ ID NO. 7, SEQ ID NO. 9, SEQ
ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17, SEQ ID NO.
20, SEQ ID NO. 22, SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 26, SEQ ID NO. 27, SEQ
ID NO. 28, SEQ ID NO. 30, SEQ ID NO. 31, SEQ ID NO. 32, SEQ ID NO. 33, SEQ ID NO.
34, SEQ ID NO. 35, SEQ ID NO. 37, SEQ ID NO. 38, SEQ ID NO. 39, SEQ ID NO. 40, SEQ
ID NO. 42, SEQ ID NO. 43, SEQ ID NO. 45 or SEQ ID NO. 47.

40. A method according to claim 38 wherein the SOCS gene encodes a protein comprising
an amino acid sequence substantially as set forth in SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO.
8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 14, SEQ ID NO. 18, SEQ ID NO. 21, SEQ
ID NO. 25, SEQ ID NO. 29, SEQ ID NO. 36, SEQ ID NO. 41, SEQ ID NO. 44, SEQ ID NO.
46 or SEQ ID NO. 48.
. 5 5 9 cgaggcteaagctecgggcgga 1 1 c tgcgt gccgct c t eg

~IZQ ctccttggggtctgnfcggccgsfcctgtgccacccgsracgcccsfgctcactgcctctgrtct 